Quick tip: store_location with subdomains

Posted by Jon
on Thursday, May 01

Both restful_authentication and the older acts_as_authenticated have a handy method called store_location. This method stores a URL in a session variable for future reference. The obvious use case involves login. For example, if you’re browsing a product anonymously and want to write a review, you’ll need to sign in first. If you click a link on that product page that requires you to be logged in and get sent through the login process, you’ll ideally want to be returned right back to where you were before you logged in. store_location enables this, along with redirect_back_or_default, which is also provided by Rick Olson’s authentication plugins.

You store a location like this:


  def private_action
    unless logged_in?
      store_location
      redirect_to login_path
    end
  end

After authenticating the user, you send them back to the stored location with this:


  def login
    if login_successful? # pseudocode, obviously
      redirect_back_or_default(home_path)
    end
  end

If a location is stored in the session, redirect_back_or_default will send the user to that location. Otherwise, it redirects to the default path.

This is pretty handy. But unfortunately, it doesn’t jump across domains, including subdomains. Tumblon lets parents set up blogs for their families, and these blogs are either identified by a subdomain (e.g. myfamily.tumblon.com) or by a top-level domain (coming soon). Tumblon also has privacy controls, so I can set a story to be viewable only by my family and friends. So if an anonymous user hits the URL of a private photo/story/video, they should be redirected to the login screen and then right back to the item they were trying to view. But out of the box, store_location can’t handle this.

Let’s look at the store_location method to see why. This method is in lib/authenticated_system.rb.


    def store_location
      session[:return_to] = request.request_uri
    end

store_location uses the request.request_uri method, which only provides the path relative to the host (e.g. /photos/932783). So if you log in at tumblon.com, store_location won’t return you to myfamily.tumblon.com/photos/932783; it will send you to tumblon.com/photos/932783. Your app could have logic to redirect from this page to the subdomain, but an easier solution is to create a new method, like store_location_with_domain, that stores request.url instead. Or, if you don’t want a separate method, you could override store_location itself to use request.url instead of request.request_uri.


    def store_location_with_domain
      session[:return_to] = request.url
    end

Put this method in application.rb, and you can now use redirect_back_or_default to hit an exact URL – complete with subdomain, top-level domain, and port.
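
To see the difference concretely, here is a minimal plain-Ruby sketch that runs outside Rails. FakeRequest and FakeController are hypothetical stand-ins for the real request object and controller, and redirect_back_or_default is simplified to record its target rather than issue a redirect, so the plugin's actual implementation may differ in detail.

```ruby
# Hypothetical stand-in for the Rails request object (illustration only).
FakeRequest = Struct.new(:protocol, :host_with_port, :request_uri) do
  # Full URL, analogous to request.url in Rails.
  def url
    "#{protocol}#{host_with_port}#{request_uri}"
  end
end

# Hypothetical stand-in for a controller that mixes in AuthenticatedSystem.
class FakeController
  attr_reader :session, :redirected_to

  def initialize(request)
    @request = request
    @session = {}
  end

  # Path only, as in the plugin's version shown above.
  def store_location
    session[:return_to] = @request.request_uri
  end

  # Full URL, including subdomain and port.
  def store_location_with_domain
    session[:return_to] = @request.url
  end

  # Simplified: records the target instead of redirecting.
  def redirect_back_or_default(default)
    @redirected_to = session.delete(:return_to) || default
  end
end

request = FakeRequest.new("http://", "myfamily.tumblon.com", "/photos/932783")

plain = FakeController.new(request)
plain.store_location
plain.redirect_back_or_default("/")
puts plain.redirected_to        # path only; the subdomain is lost

with_domain = FakeController.new(request)
with_domain.store_location_with_domain
with_domain.redirect_back_or_default("/")
puts with_domain.redirected_to  # full URL, subdomain intact
```

The path-only value is resolved against whatever host handles the login, which is why the user lands on the wrong domain.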

Quick links

Posted by Luke
on Wednesday, April 30

MySQL’s Over-looked and Under-worked Slow Query Log.

Dynamite is a JRuby interface to Processing.

How to send multipart/alternative e-mail with inline attachments.

Prototip and Starbox are awesome.

Note to self: alias_method_chain doesn’t work with ActiveRecord attributes.

Rails Search Benchmarks comparing Ferret, Solr, and Ultrasphinx.

Extend String to use ActionView’s Text Helpers. I may have to add this to my standard bag of tricks. Wish it was in core. Fortunately, in Edge Rails, the helpers are now accessible by module. Nice!

ar_mailer: how to avoid memory related issues and Running ar_sendmail with monit.

Timeframe is a totally awesome looking Javascript date picker.

Datejs parses human dates in JavaScript. Very cool.

Promise and Peril for Alternative Ruby Impls. JRuby’s Charles Nutter takes a look at the state of the alternative Ruby implementations and the challenges they face. I hadn’t heard of MacRuby before—sounds like it will be a great way to write Mac OS X apps. MagLev (Ruby with Smalltalk VM technology) also sounds interesting (interview) but I imagine it will cost booku bucks.

Seed Fu is a new library for loading seed data.

Working with others: Best practices for Rails teams

Posted by Luke
on Friday, February 08

This is the HTML version of the handout from my acts_as_conference presentation. If and when they post the audio of the talk, I’ll upload my slides; they wouldn’t make much sense without it. —Luke

Software is hard. Why? Fred Brooks separated software development into essence (the conceptual modeling) and accident (actually building it). As building software has gotten easier due to better tools, the essence has remained difficult. One of the best things about Rails is that it lets fewer people get more done. This helps because adding more people to a project doesn’t scale. But Rails teams still run into problems. What are they and how can we make Rails development easier?

Migrations

Having a way to evolve your database schema baked into the framework is a huge advantage. But migrations are also a source of pain for Rails teams.

Migration conflicts happen when two people check in a migration with the same number, and then everyone on the team has to manually fix their database. You can fix this with a pre-commit hook or use a plugin that allows duplicate migrations.

I call the seemingly inevitable tendency of migrations to stop working migration decay. You can fight it, if you try hard enough (continuous integration is your friend here). Or you can give up:

Note that this schema.rb definition is the authoritative source for your database schema. If you need to create the application database on another system, you should be using db:schema:load, not running all the migrations from scratch. The latter is a flawed and unsustainable approach (the more migrations you’ll amass, the slower it’ll run and the greater likelihood for issues). —db/schema.rb

Seed data

Migrations seem great for loading data, because they run automatically. However, touching your models in your migrations increases the chances that they’ll break. Whatever you do, don’t use db:fixtures:load for this—that’s for your tests.

Solution: use a rake task that loads seed data. Jeffrey Allan Hardy wrote a nice task to do this using fixtures. I don’t like fixtures for loading seed data because they don’t validate data, so I use db-populate by Josh Knowles, along with ActiveRecord::Base.create_or_update:

def self.create_or_update(options = {})
  id = options.delete(:id)
  record = find_by_id(id) || new
  record.id = id
  record.attributes = options
  record.save!
    
  record
end

Managing third-party code

The best way to reduce the difficulty of writing code is to not write it. Ruby’s great library of gems and the hundreds of Rails plugins help us write less code. However, managing third-party code is a pain. If a developer installs a gem, everyone else needs it. We used to have problems with this, but now we vendor everything. Dr. Nic’s gemsonrails makes vendoring gems easy—except for gems that must be natively compiled.

Don’t install plugins with svn:externals unless you want to rely on Joe Bob’s Random Subversion Server to be up when you’re deploying. Don’t edit plugins, unless you’ve got a good SCM or use piston. Even then, it may be better to keep your changes as monkey patches in the lib directory.

Security should be on by default

In some areas, you have to screw up to get insecure code. For example, it’s easier to use secure SQL queries than not. And cross-site request forgery protection is baked in.

Not so with cross-site scripting (XSS). You have to remember to use h() in your views. If you forget once, your site could get hacked. My xss_terminate plugin solves this by stripping HTML from all strings when the model is saved (you can override this for attributes that need HTML). We also use Erubis to auto-escape HTML in views.

(Other XSS plugins include: Cross Site Sniper, sanitize_params, SafeERB, and xss-shield. If you don’t like xss_terminate, try one of the others!)

Mass assignment code like LineItem.new(params[:line_item]) may set attributes (like total_price) you don’t anticipate—and a malicious user could end up getting charged $0.01 for a MacBook Pro. Protect your attributes with attr_protected. Better yet, use attr_accessible to create a white list of explicitly allowed attributes. Best yet, do this for all models by default. We added this code in an initializer to protect all the attributes:

ActiveRecord::Base.send(:write_inheritable_attribute, "attr_accessible", [])

Then we set attr_accessible to override the protection. This will wreak havoc on an existing code base, but I think it’s a good policy for new sites.
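
The white-list idea can be illustrated in plain Ruby. This is only a sketch of the concept, not how ActiveRecord implements attr_accessible; ACCESSIBLE_ATTRIBUTES and filter_attributes are hypothetical names made up for the example.

```ruby
# Hypothetical white list for a LineItem-like model.
ACCESSIBLE_ATTRIBUTES = [:product_id, :quantity]

# Keep only the explicitly allowed keys; everything else is dropped.
def filter_attributes(params, accessible = ACCESSIBLE_ATTRIBUTES)
  params.select { |key, _value| accessible.include?(key.to_sym) }
end

submitted = { "product_id" => 42, "quantity" => 1, "total_price" => 0.01 }
safe = filter_attributes(submitted)
puts safe.inspect  # total_price is not on the white list, so it is gone
```

With an empty white list as the default, a model accepts nothing from mass assignment until you opt attributes in, which is the point of the initializer above.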

Source control management: the heart of your project

Code is communication, and the way code has changed over time is communication, too. I rely on annotate, log, and diff extensively when figuring out why code works the way it does. To help your team members, you must write informative log messages: what changed, why, and the bug number (if applicable). Commit atomic changes; don’t patch bomb a bunch of unrelated code.

Bug tracking

The bigger your team, the more important a good bug tracking system is. What makes a good bug tracker? The features I look for are: workflow with open, closed, and resolved states; e-mail integration; SCM integration; and shared saved searches. The most important feature is ease of use, because if it’s not easy, people won’t use it.

Continuous integration

Continuous integration (CI) builds your software every time someone makes a change. CI runs tests, but more than that it ties everything together. It simulates a deployment: checks out your code, creates the database and loads the seed data, and ensures all the necessary libraries are there. If you broke something, CI will let you know right away. CI takes a while to set up and may be overkill for small teams, but on larger teams, CI is especially helpful.

Does it matter?

The practices above smooth over rough patches, automate processes, and manage communication. For the most part, they attack the accident of Rails development. But the majority of what makes software hard is figuring out what to build: the essence. Fred Brooks argues that to make real improvements in software productivity, we must attack the essence. To get at the essence, do what you can to limit complexity: use existing open source or packaged software; do less (a la 37Signals); split complicated projects into smaller pieces.

Iterative development

Iterative development is the most important software development technique. You can write tests all day long, but if you’re building the wrong thing, they won’t help you. With iterative development, your understanding of what you are trying to build grows with time and feedback from the customer (if you aren’t getting regular feedback, it’s not iterative). There is a difference between incremental and iterative. Iteration is the process of continuous refinement; incremental is building in stages. We try to do both: build smaller pieces, and iteratively refine the software.

Loading seed data

Posted by Luke
on Thursday, January 31

At acts_as_conference next week (there’s still room to register) I’m going to be talking about challenges facing Rails teams. Today, I’d like to talk about loading your application’s seed data.

Seed data?

Seed data is anything that must be loaded for an application to work properly. An application needs its seed data loaded in order to run in development, test, and production.

Examples include everything from an initial administrator account to small enumerations to huge amounts of data (one example of seed data given by a developer on the Ruby Users of Minnesota included every airport in the world).

Seed data is mostly unchanging. It typically won’t be edited in your application. But requirements can and do change, so seed data may need to be reloaded on deployed applications.

The ideal solution would be automatic: you shouldn’t have to think about it. When you check out the code and start up your app, it should be ready. It should provide data integrity: the created records should pass your validations. And it should be easy to update your seed data.

Migrations

Since migrations are just Ruby code, they can be used to initialize data in the up method. This is demonstrated in the Rails documentation:

class AddSystemSettings < ActiveRecord::Migration
  def self.up
    create_table :system_settings do |t|
      t.string  :name
      t.string  :label
      t.text  :value
      t.string  :type
      t.integer  :position
    end

    SystemSetting.create :name => "notice", :label => "Use notice?", :value => 1
  end

  def self.down
    drop_table :system_settings
  end
end

Using migrations is attractive because they get run automatically.

However, they have some downsides. Adding or changing data is troubling. Adding a new migration seems annoying. But going back into your old migrations to change your data won’t work either.

The biggest problem with using migrations to load seed data is “migration decay.” The more migrations you have, the less likely the older ones are to work. If your migrations load data, they are more likely to break as your models change.

Furthermore, the movement in the Rails community is that schema.rb is the authoritative source of your DB schema, and that new databases should be created using that:

Note that this schema.rb definition is the authoritative source for your database schema. If you need to create the application database on another system, you should be using db:schema:load, not running all the migrations from scratch. The latter is a flawed and unsustainable approach (the more migrations you’ll amass, the slower it’ll run and the greater likelihood for issues).

That means no data loading migrations can be run.

Fixtures

At first glance, fixtures seem well suited for loading data. And because of that, a lot of projects go down the primrose path of using them—usually with poor results.

There are two ways to use fixtures to load seed data.

First, simply use test fixtures with rake db:fixtures:load. This is almost certainly a mistake. Your test fixtures will contain data not necessary for your application.

Second, create a separate set of fixtures, unrelated to your tests, and load those. Jeffrey Allan Hardy has a good post about how to use fixtures to load seed data. This is better, but I don’t like fixtures because they don’t validate data. It’s way too easy to end up with broken models.

One caveat about seed data, fixtures, and tests: If you use fixtures for tests, your data is deleted and the fixtures loaded. So your fixed seed data needs to be duplicated in the fixtures.

Fixture scenario builder

I haven’t used this one myself, but a number of people on the RUM list recommended using Chris Wanstrath’s Fixture Scenario Builder as a way to use fixtures without sucking (see above).

The Fixture Scenario Builder, uh, builds on Fixture Scenarios, letting you define them in Ruby (so they’re valid) and then generating fixture files for loading. Most people use this for test cases, but it can be used to load your initial data as well.

ActiveRecord::Base loader

If only there were some way to create records that were valid. Oh wait, ActiveRecord does this. Why not write a task that loads the seed data with ActiveRecord?

You’d have to make sure this gets run whenever you set up a new application. Josh Knowles has created db-populate to facilitate this approach. It provides a db:populate rake task that will run Ruby files in the db/fixtures directory.
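
The core of such a task is small. The following is a rough sketch of the idea, not db-populate's actual code: load_seed_files is a hypothetical helper, and a temporary directory stands in for db/fixtures so the example runs on its own.

```ruby
require 'tmpdir'

# Load every Ruby file in a directory, in sorted (and thus repeatable) order.
# Returns the list of files that were loaded.
def load_seed_files(dir)
  files = Dir[File.join(dir, '*.rb')].sort
  files.each { |file| load file }
  files
end

# Demo with a throwaway directory standing in for db/fixtures.
$loaded = []
Dir.mktmpdir do |dir|
  File.write(File.join(dir, '02_venues.rb'), "$loaded << :venues\n")
  File.write(File.join(dir, '01_users.rb'),  "$loaded << :users\n")
  load_seed_files(dir)
end
puts $loaded.inspect
```

Sorting the file names gives a predictable load order, which matters when one seed file depends on records created by another.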

Here’s a helper method that makes it easy to create or update records, so it can be run regardless.

class ActiveRecord::Base
  # given a hash of attributes including the ID, look up the record by ID. 
  # If it does not exist, it is created with the rest of the options. 
  # If it exists, it is updated with the given options. 
  #
  # Raises an exception if the record is invalid to ensure seed data is loaded correctly.
  # 
  # Returns the record.
  def self.create_or_update(options = {})
    id = options.delete(:id)
    record = find_by_id(id) || new
    record.id = id
    record.attributes = options
    record.save!
    
    record
  end
end

You can use it like this (in db/fixtures/venues.rb):

Venue.create_or_update(:id => 1, :name => "Coffman Union")
Venue.create_or_update(:id => 2, :name => "Alumni Center")

If you need to change the data, just edit the file:

Venue.create_or_update(:id => 1, :name => "Coffman Union")
Venue.create_or_update(:id => 2, :name => "McNamara Alumni Center")
Venue.create_or_update(:id => 3, :name => "Lind Hall")

I like this approach. The data is validated by ActiveRecord. It’s easy to update, and you can add it to your deploy recipe to make it automatic.
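
What makes this safe to automate is that the loader is idempotent: running it twice leaves the same records behind. Here is a small sketch that demonstrates this without a database. FakeVenue is a hypothetical in-memory stand-in for an ActiveRecord model, and its save! only mimics validation.

```ruby
# Hypothetical in-memory stand-in for an ActiveRecord model (illustration only).
class FakeVenue
  STORE = {}
  attr_accessor :id, :name

  def self.find_by_id(id)
    STORE[id]
  end

  def attributes=(attrs)
    attrs.each { |key, value| send("#{key}=", value) }
  end

  # Mimics save!: raises if the record is invalid.
  def save!
    raise ArgumentError, "Name can't be blank" if name.to_s.strip.empty?
    STORE[id] = self
  end

  # Same logic as the helper above, except that it copies the options
  # hash so the caller's hash is left untouched.
  def self.create_or_update(options = {})
    options = options.dup
    id = options.delete(:id)
    record = find_by_id(id) || new
    record.id = id
    record.attributes = options
    record.save!
    record
  end
end

# Loading the same seed data twice does not create duplicates.
2.times do
  FakeVenue.create_or_update(:id => 1, :name => "Coffman Union")
  FakeVenue.create_or_update(:id => 2, :name => "McNamara Alumni Center")
end
puts FakeVenue::STORE.size
```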

Loading lots and lots of data

I’ve read that both fixtures and ActiveRecord data loaders are too slow if you have lots of data (see Tonkatsufan’s comment here). In that case, the best thing to do is use your database’s preferred method of batch loading SQL inserts.

Your method here?

So that’s my survey of the available methods of loading seed data. I’m interested to hear what other people out there are doing. How do you do it?

Auto-escaping HTML with Rails

Posted by Luke
on Monday, January 28

One of the things I don’t like about Rails is that it doesn’t auto-escape HTML in user input. Forget one h call in your template and you’re screwed. Worse yet, before Rails 2.0, strip_tags and sanitize were flawed. Fortunately that’s been fixed. Django added auto-escaping even though it was a backwards incompatible change, but so far there doesn’t seem to be similar movement on the Rails front.

But I’m all about automating manual processes. So let’s fix this problem.

Sanitize before saving or before displaying? Or both?

Should you sanitize text before saving it or before displaying it?

It’s nice to not need to worry about doing anything extra in your views. However, if a field escapes your notice, you may be open for an attack.

I think your first line of defense should be model-level sanitization, but auto-escaping HTML is good backup. Doing both covers your bases at a cost of extra processing.

Introducing xss_terminate

xss_terminate is a plugin that makes stripping and sanitizing HTML stupid-simple. It’s install and forget. And you can forget about forgetting to h() your output, because you won’t need to anymore. It’s based on acts_as_sanitized by Alex Payne but updated for Rails 2.0, and with some new features.

I like acts_as_sanitized but it’s not being maintained any more so Alex gave me the OK to take his code and do something different with it. Here’s what makes xss_terminate different:

  • It works with Rails 2.0.
  • It’s automatic. It is included with default options in ActiveRecord::Base so all your models are sanitized. Period.
  • It works with migrations. Columns are fetched when model is saved, not when the class is loaded.
  • You can decide whether to sanitize or strip tags on a field-by-field basis instead of model-by-model.
  • HTML5lib support if Rails’s HTML parser isn’t doing it for you.

Here’s how you use it.

To install: script/plugin install http://xssterminate.googlecode.com/svn/trunk/xss_terminate

Strip HTML tags from all the fields in a model

class Article < ActiveRecord::Base
end

Done. All models have tags stripped by default.

Sanitize HTML from some fields

class Article < ActiveRecord::Base
  xss_terminate :sanitize => [:body]
end

Use HTML5lib to sanitize HTML from some fields

HTML5lib is a new library for parsing HTML for Python and Ruby. Its goal is to parse HTML like browsers do, so it’s very fault-tolerant. If you want to use it, gem install html5 and use the :html5lib_sanitize option. This is thanks to code by Jacques Distler.

class Article < ActiveRecord::Base
  xss_terminate :html5lib_sanitize => [:body]
end

But I don’t want to strip HTML at all from that field!

class Article < ActiveRecord::Base
  xss_terminate :except => [:title, :body]
end

Putting it all together

And of course, you can put these options together. Remember, fields are stripped of tags by default, so that’s assumed unless you override it.

class Article < ActiveRecord::Base
  xss_terminate :except => [:author_name], :sanitize => [:title], :html5lib_sanitize => [:body]
end

Report bugs at the xss_terminate Google Code site.

Extra credit: Use Erubis

Erubis catches 80% of HTML escaping screw ups by making them impossible. You can use it in conjunction with xss_terminate or other XSS plugins to give yourself an extra layer of protection. (See our post on setting up Erubis with Rails 2.0.)

With Erubis, code like <%= "<script>alert('pwnd')</script>" %> can be auto-escaped.

However, all Rails helpers which generate HTML must be called with <%== %> so the HTML is not escaped. This leaves an opening for attacks like this:

<%== link_to user.name, "/some/url" %>

If user.name contains XSS you’re pwnd.
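
The hole is easy to reproduce in plain Ruby. In this sketch, naive_link is a hypothetical, stripped-down stand-in for a link helper that does not escape its arguments (it is not Rails's link_to), and ERB::Util.html_escape from the standard library plays the role of h().

```ruby
require 'erb'

# Hypothetical stand-in for a link helper that does not escape its
# arguments (illustration only, not Rails's link_to).
def naive_link(name, url)
  %(<a href="#{url}">#{name}</a>)
end

malicious_name = "<script>alert('pwnd')</script>"

# Raw output, as with <%== %>: the script tag reaches the page intact.
unsafe = naive_link(malicious_name, "/some/url")

# Escaping the user-supplied part before building the link closes the hole.
safe = naive_link(ERB::Util.html_escape(malicious_name), "/some/url")

puts unsafe
puts safe
```

In a template, the equivalent fix would be to wrap the user-supplied argument in h() before handing it to the helper.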

So while Erubis is a marked improvement over Erb it’s not a cure-all. That’s why I like to use both approaches.

Other approaches

There’s been a lot of discussion about Rails and XSS lately, so I’m hopeful that the situation will get better. Here’s a couple other XSS protection projects you can check out:

  • SafeERB – Throws exceptions if you try to display tainted strings. Call h() to untaint.
  • xss-shield – automatically h() strings unless marked as “safe”.
  • sanitize_params – strip HTML from your parameters before they hit your models.
  • AntiSamy – another whitelist-based approach (not available for Rails)

Also, check out Is your Rails App XSS Safe? and Never Untaint by Stu Halloway and Jacques Distler’s posts about making Instiki XSS-safe: XSS and XSS 2 (these are must read).

Fuzzing your database for fun and profit

Posted by Luke
on Friday, January 25

Fuzz testing is throwing random data at your application and seeing what breaks. We don’t usually do that. But we often do need lots of semi-realistic data added to our development database.

This helps you:

  • see how things will look when there’s more in the site.
  • nail down the indexes you’ll need (Queries that run fine with 10 rows of fixture data fall down on 10,000 rows of random data).

It’s possible to do this with fixtures and ERB but I find it tedious. Plus by using Active Record directly you can guarantee that the objects you’re inserting are valid.

First, create a new rake task in lib/tasks/fuzz.rake:

namespace :db do
  desc 'Insert some random posts'
  task :fuzz => :environment do
    if RAILS_ENV.downcase == "production"
      raise "You can't fuzz your production environment. Think of the children!"
    end
    
    Fuzz.execute(ENV['SIZE'].to_i)
    
  end
end

You’ll call this with rake db:fuzz SIZE=1000. You can actually put all the code in the rakefile, but it’s a little easier to manage to split it out into a separate class.

In lib/fuzz.rb, write something like this example, which finds a random user and adds a post from them to the system SIZE times. The fuzz script could do anything you want, though.

class Fuzz
  ActiveRecord::Base.establish_connection(RAILS_ENV.to_sym)

  # This file location varies by OS. This is the Mac OS X location.
  # At 2.4M, you have plenty of RAM to read it all into memory!
  @@words = File.readlines("/usr/share/dict/words").collect do |line|
    line.strip
  end
  
  def self.execute(size)
    if size == 0 or size.nil?
      size = 100
    end

    ActiveRecord::Base.silence {
      User.transaction do
        size.times do
          user = User.find(:first, :order => "rand()")
          user.posts.create!(:body => random_words(rand(30)))
        end
        puts "Created #{size} posts"
      end
    }
  end
  
  # provide a string with num random words in it.
  def self.random_words(num = 1)
    w = []
    num.times do
      w << @@words[rand(@@words.size)]
    end
    w.join(" ")
  end
  
end

Silencing the logger and using a transaction makes the code execute faster, which matters if you’re running 10,000 of these. Another thing you can do to speed things up is disable timestamps, but I’ve found that causes more trouble than it’s worth, because you often want to use those timestamps in your app!
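
Since the dictionary file location varies by OS, the word-picking part can be made more portable. The sketch below is a standalone variant, not a drop-in replacement for the class above: load_words, FALLBACK_WORDS, and the optional rng argument are assumptions added for illustration.

```ruby
# Small built-in list used when no dictionary file is available (arbitrary words).
FALLBACK_WORDS = %w[apple banana cherry delta echo foxtrot]

# Read one word per line from path, or fall back to the built-in list.
def load_words(path = "/usr/share/dict/words")
  return FALLBACK_WORDS unless File.readable?(path)
  File.readlines(path).map(&:chomp).reject(&:empty?)
end

# Build a string of num random words; pass a seeded Random for repeatable output.
def random_words(words, num = 1, rng = Random.new)
  Array.new(num) { words.sample(random: rng) }.join(" ")
end

words = load_words
puts random_words(words, 5)
```

Passing a seeded Random (for example Random.new(42)) makes a fuzz run reproducible, which helps when a particular batch of data exposes a bug.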

Extra credit: While the data generated from random dictionary words is often hilarious, it’s not very realistic. Use Faker to create more realistic fake data and sometimes to randomize those non-required fields.

Managing Migrations

Posted by Luke
on Monday, January 14

This is part of a series in which I am exploring the best practices for working together on a Rails team in advance of my acts_as_conference presentation. Earlier, I wrote about working together effectively and argued that bigger projects require better tools.

Migrations are great, but they are not without problems. Here’s some I’ve run into and how to fix or avoid them.

Migration Decay

You’ve been assigned to fix some bugs in a project your company built a while ago. You update the code and then run rake db:migrate.

rake aborted! (See full trace by running task with --trace)

D’oh.

I call this seemingly inevitable breakdown “migration decay.”

This goes against the general principles we’ve been talking about. It’s not automatic. It adds barriers when you add new members to a project, stressing the communications channels.

Note that this schema.rb definition is the authoritative source for your database schema. If you need to create the application database on another system, you should be using db:schema:load, not running all the migrations from scratch. The latter is a flawed and unsustainable approach (the more migrations you’ll amass, the slower it’ll run and the greater likelihood for issues).

db/schema.rb

Idealistically, I’m for using migrations and keeping them running from version 0 to version n. Migrations are cool because they can do it all: change your schema, load your seed data, and modify data on the server.

But as your models change, the old migrations don’t. And so it’s very likely that you’ll end up with migrations that stop working.

With enough work, you can probably keep them running—most of the time. But I’ve just been in too many situations where the migrations have failed. Worse is if they fail half-way in, because the migration number’s not incremented, but half of it has been applied. So the next run will fail, too, and then you have to fix the database manually. Bleh.

This has led me to re-evaluate the use of schema.rb. Since schema.rb can’t load data, that means you need to separate migrations and data loading.

Re-organizing migrations

I’ve worked on a couple of projects where the existing migrations were consolidated into a single migration at the end of the development cycle.

This seems like too much trouble to me. Especially if we are giving up on migrations as the way to create the database. Who cares how many there are in that case? Plus you have to screw around with the server to re-set the schema_info table.

rake db:reset

On my latest project, I have been editing my migrations as needed and then re-creating the database with rake db:reset db:migrate.

I can only do this because it’s a small project and we haven’t deployed yet. But it does keep the migrations much cleaner and easier to understand.

Conflicting migrations

Anyone ever have this happen?

Chances are if you’ve worked on a project with more than a couple developers, you’re going to get conflicting migrations checked into your source control. And the more developers you have, the more time your whole team will waste cleaning up afterwards. Branches present another problem for migrations.

The root problem is that migrations layer another level of version control on top of your SCM. For a really insightful look at this thorny problem, I recommend reading the Django project’s Schema Evolution wiki page, particularly this part. (No, they haven’t solved it either.)

Some potential solutions:

  1. Use a SCM hook to prevent checking in conflicting migrations. There is a pre-commit hook that does this for Subversion. This wouldn’t work for branches, though.
  2. Use a plugin that extends migrations so they aren’t based (exclusively) on numbers. ELC Technologies has a plugin called Duplicate Migrations that allows this. There have also been a few other attempts, most notably Enhanced Migrations from Revolution Health and Independent Migrations by Courtenay.
  3. Write all the changes directly in schema.rb and let Auto-migrations take care of the database while your SCM keeps track of the merging.

I find the last option attractive, but I’m not sure I’d trust it for production use.

I really like the Duplicate Migrations plugin. Not only can developers concurrently change the database without problems, but development could continue on a maintenance branch without causing problems for the mainline. I plan to use it on my next big project.

I’ve also used the SCM hook approach with great success (within a single branch of development). There, the person who checks in their code “last” has to fix the problem, so it costs them some time. But it doesn’t kill everyone else’s time, too.
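
The check at the heart of such a hook is simple. This is a minimal sketch of the idea rather than the actual Subversion hook mentioned above: duplicate_migration_numbers is a hypothetical helper, and it assumes the NNN_name.rb naming convention for migration files.

```ruby
# Group migration files by their numeric prefix and return only the
# numbers claimed by more than one file.
def duplicate_migration_numbers(filenames)
  by_number = Hash.new { |hash, key| hash[key] = [] }
  filenames.each do |path|
    base = File.basename(path)
    match = base.match(/\A(\d+)_/)
    by_number[match[1].to_i] << base if match
  end
  by_number.select { |_number, files| files.size > 1 }
end

files = [
  "db/migrate/001_create_users.rb",
  "db/migrate/002_add_posts.rb",
  "db/migrate/002_add_comments.rb"
]
conflicts = duplicate_migration_numbers(files)
conflicts.each do |number, names|
  puts "Migration #{number} is defined twice: #{names.join(', ')}"
end
```

A real hook would run this over the files in db/migrate as they would exist after the commit, and reject the commit if the result is not empty.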

Your $0.02 here…

So, how do you deal with migrations? Let me know in the comments. Thanks!

Bigger projects require better tools

Posted by Luke
on Sunday, January 13

We’ve seen how having more developers on a project increases the number of communication channels dramatically.
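
To put a number on "dramatically": assuming the usual model in which every pair of developers is a potential communication channel (an assumption here, since the formula is not spelled out in this post), a team of n people has n(n-1)/2 channels.

```ruby
# Number of pairwise communication channels on a team of the given size.
def communication_channels(developers)
  developers * (developers - 1) / 2
end

[2, 5, 10, 20].each do |n|
  puts "#{n} developers: #{communication_channels(n)} channels"
end
```

Doubling the team from 10 to 20 roughly quadruples the channels, from 45 to 190.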

Project size also has a direct impact. To quote Code Complete:

Project size is easily the most significant determinant of effort, cost and schedule [for a software project]. People naturally assume that a system that is 10 times as large as another system will require something like 10 times as much effort to build. But the effort for a 1,000,000 LOC system is more than 10 times as large as the effort for a 100,000 LOC system.

For any non-trivial program, it is impossible to keep the whole thing in your head at once, no matter how smart you are. So in order to build software, we have to constantly battle against complexity. We split things into pieces; we automate processes so we don’t have to think about them; we document so we can remember later or delegate to others.

When it’s just me working on a project, I can track bugs on a piece of paper; there are never any problems with merges or conflicting migrations; and I know exactly who broke the test cases. As the number of developers increases, I need better tools. I need a real bug tracker, continuous integration to run the tests, and a way to deal with conflicting database changes.

What is the bearing of this on Rails? My goal is to look at how we can smooth over the bumpy edges of building a Rails project. These are the things that trip you up, that break your flow.

  • Managing migrations
  • Loading your initial data
  • Managing third-party code
  • Perhaps more…

Finally, I’ll try to answer the question: does it even matter? That is, will you get enough of a productivity boost to make these techniques worth it?

Rendering with Erubis and Rails 2.0

Posted by Eric
on Monday, December 10

Update: ActionView has been refactored in Rails 2.0.2, making Erubis’ Rails helper, and the “Create an Erubis Initializer” section of this article, obsolete. See the comments for a Rails 2.0.2 compatible initializer. Thanks for the tip, Jason!

Erubis is a drop-in replacement for Erb. Among its many features are a few notable improvements in terms of speed and security (it optionally supports automatic HTML escaping).

Sample Erubis Syntax:
# Erubis with auto HTML escaping enabled:

Hello, <%= current_user.name %> # equivalent to h(current_user.name)

<%== render :partial => 'user' %>
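
To make the escaping difference concrete, here is a small sketch that uses plain ERB from Ruby's standard library rather than Erubis, so it runs without installing the gem. The name value and both templates are made up for illustration, and result_with_hash assumes Ruby 2.5 or newer. With stock ERB you have to escape by hand; with Erubis auto escaping enabled, the plain <%= tag would give you the escaped form.

```ruby
require "erb"

# Hypothetical user-supplied value containing markup.
name = "<b>Eric</b>"

# Stock ERB: <%= %> inserts the value as-is.
unescaped = ERB.new("Hello, <%= name %>").result_with_hash(name: name)

# Escaping by hand, which is what h() does in a Rails view.
escaped = ERB.new("Hello, <%= ERB::Util.html_escape(name) %>").result_with_hash(name: name)

puts unescaped # Hello, <b>Eric</b>
puts escaped   # Hello, &lt;b&gt;Eric&lt;/b&gt;
```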

Installing Erubis:

1. Install the gem


gem install erubis

2. Create an Erubis initializer

config/initializers/erubis.rb
# Via http://www.kuwata-lab.com/erubis/users-guide.05.html#topics-rails
# The above link also references an optional patch that can be applied.

require 'erubis/helpers/rails_helper'

# These are optional settings:
Erubis::Helpers::RailsHelper.init_properties = { :escape => true, :escapefunc => 'h' }

# Erubis::Helpers::RailsHelper.engine_class = Erubis::Eruby # or Erubis::FastEruby
# Erubis::Helpers::RailsHelper.show_src = false
# Erubis::Helpers::RailsHelper.preprocessing = true

3. Create custom rescue templates

The default Rails debug views need to be slightly modified to support Erubis. This problem only pops up in a few spots, but Erubis doesn’t handle inline statements:

# Default Rails sample:
<%= request.parameters["controller"].capitalize if request.parameters["controller"] %>

# Erubis compatible rewrite:
<% if request.parameters["controller"] %>
<%= request.parameters["controller"].capitalize %>
<% end %>

If auto-escaping is enabled, all instances of <%= need to be replaced with <%== and <%=h replaced with <%=.

Step 1: Download these auto-escaped, Erubis-friendly templates and put them in app/views/rescues

Step 2: Redefine the rescues path to point to the modified templates:

controllers/application.rb
# For Erubis compatible debug templates  
def rescues_path(template_name)
  "#{view_paths.first}/rescues/#{template_name}.erb"
end

Using Erubis

Auto escaped Erubis vs Erb

# Erubis doesn't handle inline parsing well.
# While in Erb, this would work:
# <%=h status = current_user.status if returning_user? %> 
# It needs to be broken out in Erubis:

<% if returning_user? %>
  <%= current_user.status %> # Auto HTML escaped
<% end %>
                                                                          
Here's a list of your friends, <%= current_user.first_name %>        # Auto-escaped
<ul>
  <%== render :partial => 'friend', :collection => @friends %>       # Not escaped
</ul>
<p><%== submit_tag "Add a friend", :class => "button" %></p>         # Not escaped

A few things to remember:
  • Erubis doesn’t handle inline statements well
  • If auto escape is enabled, use the <%== operator when rendering partials and helpers
  • Update the templates in app/views/layouts

Measuring your test coverage with Heckle and RCov

Posted by Jon
on Thursday, November 29

I gave a presentation at RUM on Monday about code metrics. In particular, I showed tools for measuring two aspects of code: test coverage and complexity. Here are my slides.

Saikuro and Flog measure code complexity. Saikuro measures cyclomatic complexity, the number of independent paths through a method. Flog, on the other hand, parses your code and assigns a complexity value to assignments, branches, and calls. The goal, of course, is to minimize code complexity. This is an important goal, but I’m not sure yet what I think of these measurement tools. I haven’t used them enough to know if they have practical value.

Heckle and RCov, on the other hand, are useful. I’m going to look at each in more detail here.

RCov

RCov measures C0 code coverage. That is, it runs your test suite, and looks at what lines of your application were run or not run. It then gives you a nice HTML report with red and green lines – red for lines of code that are not run, and green for lines that are run.

If your test suite doesn’t execute a line of your application code, it is safe to say that that line is not tested. On the other hand, if a line of your application is run, it is NOT safe to say that it IS tested. A test method with no asserts works just fine for RCov’s purposes, thank you very much. Take a look at this code.

def test_user_assignment
  User.assign
end

This test is enough to mark the User.assign method as tested. But nothing is asserted, and so nothing is tested. The problem is equally true even if you aren’t in the habit of writing tests without assertions; you may make assertions about some aspects of a method, but forget about other aspects. And RCov won’t tell you this.

Logically speaking, RCov tells you that if line_is_red, then !line_is_tested. From this, you can also infer the contrapositive: if line_is_tested, then !line_is_red. But that’s all you know. If a line is green, RCov tells you nothing at all. Saying if !line_is_red, then line_is_tested is a formal fallacy (denying the antecedent). And that’s bad.

So 100% RCov coverage is not equal to 100% test coverage. In fact, the two have nothing to do with each other. Your code could have 100% or 95% or 75% RCov coverage, and be extremely poorly tested.
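
Here is a small sketch of the gap between "executed" and "tested". The Discount class and both checks are hypothetical, written as plain Ruby methods rather than Test::Unit cases so the example runs on its own. Both checks execute every line of Discount.apply, so a C0 tool would show the method as green either way, but only the second one can notice the bug.

```ruby
# Hypothetical class, used only for illustration.
class Discount
  # Bug: a 10% discount should multiply by 0.9, not 1.1.
  def self.apply(price)
    price * 1.1
  end
end

# Runs the method but asserts nothing. It can never fail,
# yet it gives Discount.apply full line coverage.
def smoke_check
  Discount.apply(100)
  true
end

# Same line coverage, but compares the result with the expected value.
def real_check
  (Discount.apply(100) - 90.0).abs < 0.001
end

puts smoke_check # true  (the bug goes unnoticed)
puts real_check  # false (the bug is caught)
```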

In my experience, RCov is a one-time tool. That’s because green lines in RCov don’t tell you anything at all about your test coverage. Red lines provide the real value. If you run RCov, find an untested method, and write up a quick test hack that provides C0 coverage, RCov will never complain about that method again. It will be off your RCov radar. This is too bad, because it is really useful to know what is poorly tested. So whenever you see red in RCov, take the time to write comprehensive tests to cover the untested code.

Heckle

Heckle is a mutation tester that changes your code and checks to see whether your tests catch the changes. If Heckle is able to change instances of true to false (or 32 to nil, or remove method calls) in your application without creating a test failure, then your code isn’t tested well enough. To run it effectively, do this:

heckle Class method -t /test/units/class_test.rb -T 30

heckle is the tool, installed as a Ruby gem. Class is the name of the Ruby class you want to heckle. method is a method on the class; you can leave this out, but I don’t recommend it. -t /test/units/class_test.rb is the path to the unit test you want to use (also optional). Finally, -T 30 specifies a timeout for the test, in case your mutation creates an infinite loop.

You can leave out the last three options and just run Heckle with a class:

heckle Class

But I don’t recommend it.

First, it will take forever.

Second, you may run into infinite loops.

Third, heckle will unfortunately test EVERY method available to a class, including methods included by modules, superclasses, etc. So if you’re heckling an ActiveRecord class, you’re going to see dozens of Rails magic methods, not just the methods that you wrote.

Fourth, your UserTest should cover your User class on its own, if your code is well written and well tested; it shouldn’t rely on the ProductTest class (or another test). One problem with Heckle is that it doesn’t distinguish between well tested code and highly coupled code, where a small change somewhere causes the application to fall apart somewhere else. This problem can be minimized by only comparing a single method to a single test class.
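
To make the idea of a mutation concrete, here is a hand-rolled sketch. It does not use Heckle and is not how Heckle is implemented; the free-shipping rule, the mutant, and both checks are hypothetical. The mutant flips a literal true to false, which is the kind of change described above, and the question is whether a given test notices.

```ruby
# Hypothetical code under test.
original = ->(total) { total >= 50 ? true : false }

# A hand-made mutant: the literal true has been flipped to false.
mutant = ->(total) { total >= 50 ? false : false }

# A weak test only checks an order that does not qualify.
weak_test = ->(impl) { impl.call(10) == false }

# A stronger test also checks an order that does qualify.
strong_test = ->(impl) { impl.call(10) == false && impl.call(75) == true }

puts weak_test.call(original)   # true
puts weak_test.call(mutant)     # true  (mutant survives: the test is too weak)
puts strong_test.call(original) # true
puts strong_test.call(mutant)   # false (mutant is caught)
```

If a mutant survives, that points at a behavior no assertion covers, which is exactly the information a green RCov line cannot give you.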

I like Heckle and find it pretty useful. Unfortunately, it needs a little developer love. The -T timeout parameter is flaky; it doesn’t always play nice with its dependencies (especially ParseTree 2.0.x, the current version); and it would be more useful if by default it only heckled the methods directly added by a class, not methods brought in through parent classes, includes, or fancy metaprogramming. This is a shame, because it is really a great tool. Hopefully Kevin Clark and Ryan Davis have an update in the works.

Seeking MMS sample data

Posted by Luke
on Monday, November 12

The MMS2R project is pushing forward to a 2.0 release that promises to be easier to use and more Ruby-like.

We want to make this the best MMS library in any language, and to do that we need your help.

We currently support the following carriers. If you don’t see your carrier on this list, please send an MMS message to me at luke@slantwisedesign.com and I’ll work on adding it to the library. We especially need European carriers.

  • Alltel
  • AT&T/Cingular
  • Dobson/Cellular One
  • Helio
  • Nextel
  • Orange (Poland)
  • Orange (France)
  • PXT (New Zealand)
  • Sprint
  • T-Mobile (USA)
  • Verizon

iPhone subdomains with Rails

Posted by Luke
on Thursday, November 08

iPhone! It seems like everyone has one, and those who don’t have one are talking about it. (I fall into the latter category.)

I recently attended an Apple iPhone Tech Talk with some of my colleagues from Slantwise. It was well worth it. I highly recommend going if the talk comes to your area. If you can’t attend, you can watch the videos online. Getting into the nitty-gritty of how to develop for the iPhone and iPod Touch was very interesting, but to me the most useful aspect of the class was the information on the iPhone web application user interface guidelines.

Most examples I’ve seen for how to do a special view for the iPhone suggest something like this:

class ApplicationController < ActionController::Base  

  before_filter :adjust_format_for_iphone

  def adjust_format_for_iphone
    if request.env["HTTP_USER_AGENT"] && request.env["HTTP_USER_AGENT"][/(iPhone|iPod)/]
      request.format = :iphone
    end
  end
end

However, Apple’s user interface guidelines for the iPhone advise against user agent sniffing. The reason is that iPhone users are used to being able to use the entire web. They don’t want a limited subset.

Another problem is that when Apple releases a new device, your code will need to be updated to work with it. This actually happened when the iPod Touch was released. Some iPhone sites didn’t work on the iPod Touch because (unlike the code above) they only sniffed for “iPhone”.

When I was at the Apple iPhone Tech Talk, Apple suggested the best way to develop web applications for the iPhone was to provide the full version of your site, with a link to the iPhone web app. The iPhone version should focus on discrete functionality, and look like a native iPhone application. But if an iPhone user ever needs to use the “real” site, it’s just a click away. (Examples of sites doing this include Facebook and Amazon.)

Fortunately, this is still easy to do with Rails.

class ApplicationController < ActionController::Base  

  before_filter :adjust_format_for_iphone

  def adjust_format_for_iphone
    if request.subdomains.first == "iphone"
      request.format = :iphone
    end
  end
end

You can test that your subdomain detection works with something like this:

def test_hitting_app_using_iphone_subdomain_should_set_iphone_virtual_mime_type
  @request.host = "iphone.test.host"
  get :index
  assert_equal :iphone, @request.format.to_sym
end

This is kind of a drag to develop with, so when not in production mode, I sniff based on the user agent like the old way.

def adjust_format_for_iphone
  if request.subdomains.first == "iphone" || 
     (RAILS_ENV != "production" && 
      request.env["HTTP_USER_AGENT"] && 
      request.env["HTTP_USER_AGENT"][/(iPhone|iPod)/])
    request.format = :iphone
  end
end

With this, I can use iPhoney to test my code on localhost, but the sniffing isn’t used when deployed.

For those times when you do need user agent detection, Apple recommends testing the “Mobile/XX” part of MobileSafari’s user agent string. This will work across iPhone, iPod Touch, and future MobileSafari devices.

Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/XX (KHTML, like Gecko) Version/ZZ Mobile/WW Safari/YY

Here’s an AbstractRequest#iphone? method. You can use it in your views to display a message to iPhone users telling them about your iPhone-optimized site or web application (like Amazon does). But I’m not sure if this is the “Rails way” to do this. Let me know in the comments.

module ActionController
  class AbstractRequest
    def iphone?
      self.env["HTTP_USER_AGENT"] && self.env["HTTP_USER_AGENT"][/(Mobile\/.+Safari)/]
    end
  end
end

And of course, I used the above method to refactor my adjust_format_for_iphone method.
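
The refactored filter isn't shown in the post, so here is a rough sketch of what it might look like. To keep it runnable outside Rails, it uses a stand-in FakeRequest struct and takes the environment name as an argument; both are assumptions for illustration, as is the sample user agent value. In a real controller the method would take no arguments and read request and RAILS_ENV directly.

```ruby
# Stand-in for the Rails request object (hypothetical, for illustration only).
FakeRequest = Struct.new(:subdomains, :env, :format) do
  # Same check as the AbstractRequest#iphone? method above.
  def iphone?
    ua = env["HTTP_USER_AGENT"]
    !!(ua && ua[/(Mobile\/.+Safari)/])
  end
end

# Subdomain detection everywhere; user agent sniffing only outside production.
def adjust_format_for_iphone(request, rails_env)
  if request.subdomains.first == "iphone" ||
     (rails_env != "production" && request.iphone?)
    request.format = :iphone
  end
end

# Sample user agent in the shape shown above.
ua = "Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/420 " \
     "(KHTML, like Gecko) Version/3.0 Mobile/1A543a Safari/419.3"

dev = FakeRequest.new([], { "HTTP_USER_AGENT" => ua }, :html)
adjust_format_for_iphone(dev, "development")
puts dev.format # iphone

prod = FakeRequest.new([], { "HTTP_USER_AGENT" => ua }, :html)
adjust_format_for_iphone(prod, "production")
puts prod.format # html
```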

For more on this topic, check out the iPhone Dev Center. You have to be an ADC member to look at the content, but signing up is free.

Update: Check out Ben’s Slash Dot Dash article on iPhone on Rails – Creating an iPhone optimised version of your Rails site using iUI and Rails 2 for some new tips and tricks.

Bringing the Rails Magic to Facebook

Posted by Eric
on Friday, November 02

While there are a few plugin options for the Rubyist looking to write a Facebook application, none of them quite fit our needs. I opted to write one for internal use at Slantwise. Why? A fundamental difference between this code and the publicly available solutions is that it’s Rails-centric.

If you’re not planning on writing your app in Rails, this isn’t the library for you. The benefit to the Rails programmer is a Facebook interface that’s borderline indistinguishable from other Rails code, meaning it’s understandable, enjoyable, and doesn’t require hours of poring over Facebook’s API documentation.

Ruby, where art thou?

This first sample shows some of the library’s low-level functionality, and it’s pretty similar to the other solutions out there. It also demonstrates the inherent problems with simply wrapping API calls.

  # Retrieves photos from the first album that's found for the current user on Facebook
  # Then publishes some of the photos as a news update in Facebook
  ...
  # Retrieve the photos
  recommended_album   = Facebook::API::Albums.get(:uid => current_user.facebook_uid).first
  @recommended_photos = Facebook::API::Photos.get(:aid => recommended_album[:aid])

  # Assign all the necessary API params for publishing to Facebook.
  # This is a more complex (and abbreviated) API call
  # A typical update call will take at least a few lines of code
  feed_body_template = render_to_string(:partial => 'facebook/_body_template')
  feed_body_general  = render_to_string(:partial => 'facebook/_body_general')
  ...
  recommended_photos_feed = { :actor => current_user.facebook_uid,
                              :body_template => feed_body_template,
                              :body_general  => ... ,
                              ... }

  # Publish the feed to Facebook
  Facebook::API::Feed.publish_templatized_action recommended_photos_feed

Direct API calls are great in some instances, but not in most. Writing something like the code above could easily take someone new to the Facebook API hours to figure out. It also takes the magic out of working in Rails, increasing development time and developer frustration and ultimately resulting in a lesser product. Let’s bring that magic back and put a smile on our developers’ faces while we’re at it:

Hello, old friend

  
  recommended_album   = current_user.facebook.albums.find_by_name 'Halloween'
  @recommended_photos = recommended_album.photos

  update_facebook 'recommended_photos.feed'

The first line of code is a Facebook API call. It utilizes a basic ActiveRecord connection that’s adapted to run through the Facebook API, so the results behave exactly the same as any other ActiveRecord request.

The update_facebook method is a specialized version of render that looks for a template named ‘recommended_photos.feed’, fills it with data, and sends it to Facebook. In this case, the ’.feed’ extension maps to the Facebook API call Feed.publishTemplatizedActionOfUser. The rendering action is easily configurable to support any extension/API call combination.

Interested?

The plan is to release a development version of the plugin within the next week or two to iron out any glaring bugs, followed very shortly thereafter with a production version that you can freely use in your projects. I’ll announce the plugin release here when it’s available.

Get Involved

While I think I’m off to a good start, I’m looking for help. I’m on the lookout for highly useful features. If you’re developing a Facebook application and have any problem areas, send me an example of some problem code and I’ll look into incorporating a solution into the plugin.

My First Rails App

Posted by Luke
on Saturday, October 27

I’ve recently had the “opportunity” to do some work on the first Rails app I developed professionally, as well as an existing codebase that’s in a similar state.

We’ve all been there. Trying to understand old code, refactoring, adding features and tests. It’s painful. And considering how much Rails has changed in the last 2 years, it’s no wonder that working with apps written for older versions is no fun.

Here are a few “worst practices” that I’ve encountered in my own and others’ old code:

  • Using url_for instead of named routes and RESTful routes. Just say no to relying on Rails default, implicit routing. I even delete the default route from routes.rb.
  • <% @objects.each do |obj| %>...<% end %> instead of <%= render :partial => 'object', :collection => @objects %>
  • Lots of if blocks in application.rhtml instead of creating new layouts.
  • Not using content_for :whatever to insert code in layouts.
  • Not following Rails convention, like misnamed controllers: UserController instead of UsersController
  • Practically any use of scaffolding (my beef with scaffolding: code generation results in a lot of views you ultimately won’t need, which then get in your way).
  • Not using Rails association methods: Thingy.find(owner.thingy_id) instead of owner.thingy
  • Similarly, not using the built-in create and build methods on associations: Employee.create(:employer => employer, :name => "Bob") instead of employer.employees.create(:name => "Bob")
  • Security issues. XSS and attributes that should be attr_protected first among them.
  • Bad or non-existent tests.
  • Deprecated code. It probably wasn’t your fault, but given the pace of change in Rails, working with an old app almost certainly means working with code that’s now deprecated.

Do you have any of your own favorite newb mistakes? Add ‘em in the comments.

Photo by chaim zvi.

RVideo 0.9 is now available

Posted by Jon
on Tuesday, October 02

RVideo is now available as a Ruby gem. Install with:

sudo gem install rvideo

(RVideo depends on other tools for transcoding, like ffmpeg, so you’ll probably need to install a few other things as well. See the Documentation for a little more detail.)

I’ve tagged this release as 0.9.0. It is still beta-quality code, so test thoroughly. If you run into problems, let me know – I’ll be deploying RVideo to a live app soon, so I want to squash any bugs as much as you do. :)

What is it?

RVideo is a Ruby library for video/audio transcoding. It provides a clean Ruby interface to transcoding tools like ffmpeg, and can easily be extended to support more tools. At this point, only ffmpeg and flvtool2 are supported, but more will follow.

transcoder.execute(recipe, {:input_file => "/path/to/input.mp4",
      :output_file => "/path/to/output.flv", :resolution => "640x360"})

Details

To inspect a file, initialize an RVideo file inspector object. See the documentation for details.

A few examples:

  file = RVideo::Inspector.new(:file => "#{APP_ROOT}/files/input.mp4")

  file = RVideo::Inspector.new(:raw_response => ffmpeg_inspection_response)

  file = RVideo::Inspector.new(:file => "#{APP_ROOT}/files/input.mp4",
                                :ffmpeg_binary => "#{APP_ROOT}/bin/ffmpeg")

  file.fps        # => "29.97"
  file.duration   # => "00:05:23.4"

To transcode a video, initialize a Transcoder object.


  transcoder = RVideo::Transcoder.new

Then pass a command and valid options to the execute method:

  recipe = "ffmpeg -i $input_file$ -ar 22050 -ab 64 -f flv -r 29.97 -s"
  recipe += " $resolution$ -y $output_file$"
  recipe += "\nflvtool2 -U $output_file$"
  begin
    transcoder.execute(recipe, {:input_file => "/path/to/input.mp4",
      :output_file => "/path/to/output.flv", :resolution => "640x360"})
  rescue TranscoderError => e
    puts "Unable to transcode file: #{e.class} - #{e.message}"
  end
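
The recipe above is a command template with $name$ placeholders. As a rough illustration of that idea, and not RVideo's actual implementation, a placeholder substitution step could look something like this. The fill_recipe helper is hypothetical and is not part of the RVideo API.

```ruby
# Hypothetical helper: replaces each $name$ placeholder with options[:name].
# Raises if the recipe mentions an option that was not supplied.
def fill_recipe(recipe, options)
  recipe.gsub(/\$(\w+)\$/) do
    key = Regexp.last_match(1).to_sym
    options.fetch(key) { raise ArgumentError, "missing recipe option: #{key}" }
  end
end

recipe  = "ffmpeg -i $input_file$ -s $resolution$ -y $output_file$"
command = fill_recipe(recipe,
                      :input_file  => "/path/to/input.mp4",
                      :output_file => "/path/to/output.flv",
                      :resolution  => "640x360")

puts command
# ffmpeg -i /path/to/input.mp4 -s 640x360 -y /path/to/output.flv
```

Failing loudly on a missing option is a design choice for the sketch: a half-filled command line is harder to debug than an early error.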

If the job succeeds, you can access the metadata of the input and output files with:

  transcoder.original     # RVideo::Inspector object
  transcoder.processed    # RVideo::Inspector object

If the transcoding succeeds, the file may still have problems. RVideo will populate an errors array if the duration of the processed video differs from the duration of the original video, or if the processed file is unreadable.

RVideo supports any transcoding tool with a command-line interface; adding a new tool just means writing a class for the tool that subclasses RVideo::AbstractTool. It also means that you need to use common sense to avoid attacks. For example: don’t run RVideo as a privileged user. Control your input recipes, and don’t accept user-submitted recipes. (RVideo is pretty well protected from these problems; you can’t execute a command that isn’t identified by a transcoder tool class, so `rm -rf *` won’t work. But it pays to be cautious.)

More info

See the RVideo Google Code site for more info, including links to Documentation and a Google discussion group. Use these to file tickets, discuss, etc. (The SVN repository is currently at Rubyforge, but I may move it to Google Code.)

Contribute

I would love help on this project. If you want to help out, there are a few things you can do.

  • Use, test, and submit bugs/patches
  • We need a RVideo::Tools::Mencoder class to add mencoder support. (Someone has started on this, so let me know if you’re interested in helping and I’ll put you in touch.)
  • Other tool classes would be great – On2, mp4box, Quicktime (?), etc.
  • Eventually, it would be great to (optionally) use the processing feedback provided by ffmpeg etc. to get real-time progress updates (e.g. 20% complete, 40% complete, 90% complete). (More info)
  • Submit other fixes, features, optimizations, and refactorings