Class table inheritance in Rails
I enjoyed reading a post by James Coglan on the FutureLearn blog about refactoring a Rails app out of single table inheritance. If you haven’t read it then go and read it now because it’s a great walkthrough of some of the problems of STI and how FutureLearn managed to migrate away from the pain.
I found the article really interesting because I’ve recently dealt with a database schema that is almost identical to the one they describe. It was also for an e-learning site and the domain model was pretty much the same - courses comprised of an ordered collection of different types of content.
James writes about how the project originally used a STI approach to store the different content types and how they refactored to split each of the content types into their own table.
Schema before
+--------+ +---------+
|courses | |steps |
+--------+ +---------+
| id | |id |
| title | |type |
+--------+ |course_id|
|position |
|title |
|body |
|asset_id |
|url |
|copyright|
+---------+
Refactored schema
+--------+ +------------+ +---------+ +---------+ +---------+
| courses| |steps | |articles | |videos | |exercises|
+--------+ +------------| +---------+ +---------+ +---------+
| id | |id | |id | |id | |id |
| title | |course_id | |title | |title | |title |
+--------+ |content_type| |body | |asset_id | |body |
|content_id | |copyright| |copyright| |url |
|position | +---------+ +---------+ +---------+
+------------+
The refactored schema is far better, the content types tables can be individually maintained without knock-on effects. From a data integrity point of view this means that the correct constraints can be added to each table, something which was problematic before. From an ActiveRecord point of view, adding columns to a table for one of the content types will not increase the surface area of the API for the other content types.
The refactored schema uses a join table and a polymorphic relationship to link courses and content types. However, reflecting this change in the ActiveRecord models results in a wide ranging API change. Every call to an attribute of a content type need to be changed from e.g. step.body
to step.content.body
.
This pattern of linking tables via a polymorphic join table is called Class Table Inheritance. Unfortunately ActiveRecord doesn’t support this pattern out of the box.
In the e-learning application I have been working on, we have a very similar database schema to the refactored one shown above. However, I have had the luxury of being able to use Sequel rather than ActiveRecord and so I’ve been able to take advantage of using class table inheritance.
Class table inheritance in Sequel
For the past few years I’ve been using Sequel in all my Rails projects. More often than not I’ll use Sequel Model as a straight replacement to ActiveRecord. Yes, the models still get coupled to the database schema in the same way that they do when using ActiveRecord, but I find Sequel gives me much more flexibility when it comes to database work.
So, let’s look how you could approach the FutureLearn example using class table inheritance in Sequel.
The schema can be created with the following migration:
Sequel.migration do
change do
create_table :courses do
primary_key :id
String :title
end
create_table :steps do
primary_key :id
Integer :course_id, null: false
Integer :position, null: false
String :content_type, null: false
end
create_table :articles do
foreign_key :id, :steps
String :title
String :body
String :copyright
end
create_table :videos do
foreign_key :id, :steps
String :title
Integer :asset_id
String :copyright
end
create_table :exercises do
foreign_key :id, :steps
String :title
String :body
String :url
end
end
end
The only difference to the FutureLearn schema is that the steps.content_id
column is no longer needed, this is because we have set up foreign key relationships between the steps
table and the article
, video
and exercise
tables. The id of row in the steps
table will always be the same as the id of the row in the content table it links to.
Using the class table inheritance plugin that ships with Sequel, our models can be defined as follows:
class Course < Sequel::Model
one_to_many :steps, order: :position
end
class Step < Sequel::Model
plugin :class_table_inheritance, key: :content_type
many_to_one :course
end
class Articles < Step; end
class Videos < Step; end
class Exercises < Step; end
After setting this up and populating some data, we can work with the models in a way that will be familiar:
Step.all # [<#Article>, <#Video>, <#Exercise>, <#Article>, ...]
Notice here that our objects have been correctly typecast for us. We asked for the all the steps and got back a collection of articles, videos and exercises.
Similarly, this works when accessing the steps through a relationship:
course = Course.first
course.steps # [<#Article>, <#Video>, <#Exercise>]
Essentially, the CTI plugin is performing the polymorphic join for us as a single step. There’s no need to access intermediary Step
objects and then ask them for their content
.
The objects are built and created as you would expect:
Video.new(course: course, position: 1, title: 'Learn X') # Unsaved video object
Video.create(course: course, position: 1, title: 'Learn Y') # Persisted video object
Note that the course
and position
are attributes of the content items themselves, there is no need to create Step
objects.
And the properties are accessed as normal:
step = course.step.first # <#Video>
step.body # "body text"
step.position # 1
Note that position
is accessed in the same way as body
even though the columns exist on different tables.
There is no need for the extra step in the API chain that was introduced in the ActiveRecord way:
# From the FutureLearn post
step.content.body # "body"
step.position # 1
Since Article
, Video
and Exercise
all have a title
attribute we could have structured our schema such that title
was a column on the steps
table rather than a column on each of the individual tables. This would have been fine with the CTI approach, the API would remain step.title
and we’d just have one column in the database that was common to all content types. Attributes that are common to all content types go in the steps
table and the individual tables only include the columns that differentiate that type.
However, with the ActiveRecord approach, adding the title
to the steps
table would have resulted in an API inconsistency (step.title
vs step.content.body
) and so it’s likely that you’d make sure all the content type tables included all the common columns.
Wrapping up
The point of this post isn’t to show how, in a different reality, if FutureLearn had been built using different tools, they could have made a different change. There’s not a whole lot of point in that.
Instead it was to highlight some of the features available in Sequel. People often ask me why I choose Sequel over ActiveRecord and my stock response is that it gives me more flexibility. When I read the FutureLearn post, I thought it was a good, concrete example that I could use to demonstrate Class Table Inheritance, a database pattern that is supported in Sequel but not in ActiveRecord.
If you’re starting a new project, I’d encourage you to take a look at using Sequel, it really is a great library.
rails new myapp --skip-active-record