OccamsRecord provides the power tools that ActiveRecord forgot. Specifically, advanced eager loading capabilities, full support for hand-written SQL, cursors, and high performance for read-only operations. Use it alongside ActiveRecord to unlock the power of your database.

Contribute to OccamsRecord's development at github.com/jhollinger/occams-record.

Full documentation is available at rubydoc.info/gems/occams-record.

Occam's Razor & Simplicity

Do not multiply entities beyond necessity. -- William of Ockham

This definition of simplicity is a core tenant of OccamsRecord. Good libraries are simple, fast, and stay out of your way.

Fast & read-only

OccamsRecord embraces simplicity by doing less. The vast majority of ActiveRecord objects are used read-only, yet each prepares and holds internal state just in case it's used for writing. By returning results as structs, OccamsRecord offers a baseline 3x-5x speed boost and 2/3 memory reduction.

No N+1 problem

OccamsRecord simply refuses to do lazy loading, aka the "N+1 query problem". If you want to use an association, eager load it up-front. While ActiveRecord now supports similar opt-in behavior, it still can't beat Occams in speed and the power of defaults.

No arbitrary limitations

OccamsRecord also embraces simplicity by making things easier for you, the person writing code. ActiveRecord has poor support for hand-written SQL, advanced eager loading scenarios, and advanced database features like cursors. OccamsRecord makes everything simpler by providing all of this in a way that's familiar and easy to use.

Installation

Simply add the occams-record gem to your Gemfile:

gem 'occams-record'

OccamsRecord has no configuration. As a companion library to ActiveRecord, all database configuration comes from ActiveRecord.

Your First Query

Docs: query

Write your queries like normal:

q = Order.
  completed.
  where("order_date > ?", 30.days.ago).
  order("order_date DESC")

But run them with OccamsRecord:

orders = OccamsRecord.
  query(q).
  to_a

Now instead of bloated ActiveRecord objects, orders is an array of fast, small structs!

You may use any Enumerable method to run your query and iterate results:

OccamsRecord.query(q).each { |order| ... }
OccamsRecord.query(q).map { |order| ... }
OccamsRecord.query(q).reduce([]) { |acc, order| ... }

Batching

OccamsRecord provides find_each and find_in_batches methods that work similarly to their ActiveRecord counterparts.

OccamsRecord.query(q).find_each { |order|
  ...
}

OccamsRecord.query(q).find_in_batches { |orders|
  orders.each { |order|
    ...
  }
}

Using PostgreSQL? Consider using find_each_with_cursor or find_in_batches_with_cursor for a performance boost. See Cursors for more info.

Eager Loading

Docs: eager_load

OccamsRecord's eager_load method is similar to ActiveRecord's preload (i.e. it uses a separate query instead of a join).

OccamsRecord.
  query(q).
  eager_load(:customer).
  eager_load(:line_items).
  find_each { |order|
    puts order.customer.first_name
    puts order.line_items[0].cost
  }

Nested eager loading is done with blocks. Isn't it so much more readable?

OccamsRecord.
  query(q).
  eager_load(:customer).
  eager_load(:line_items) {
    eager_load(:product)
    eager_load(:something_else) {
      eager_load(:yet_another_thing)
    }
  }.
  find_each { |order|
    puts order.customer.first_name
    order.line_items.each { |i|
      puts i.product.name
      puts i.something_else.yet_another_thing.description
    }
  }

There's lots more that eager_load can do. We'll cover it in Advanced Eager Loading.

Advanced Eager Loading

Docs: eager_load

Select just the columns you need

Pulling back only the columns you need can be noticeably faster and use less memory, especially for wide tables.

OccamsRecord.
  query(q).
  eager_load(:customer, select: "id, name")

Fully customize the query

You can snag the eager load's query and customize it using your model's scopes or query builder methods (select, where, joins, order, etc).

OccamsRecord.
  query(q).
  eager_load(:customer, ->(q) { q.active.order(:name) })

There's a block-based syntax that's easier to read for long queries:

OccamsRecord.
  query(q).
  eager_load(:customer) {
    scope { |q|
      q.active.
        joins(:account).
        where("accounts.something = ?", true).
        select("customers.id, customers.name")
    }
  }

Block-argument syntax

If you need to call methods from the surrounding environment, like params in a Rails controller, use the block-argument syntax.

OccamsRecord.
  query(q).
  eager_load(:customer) { |c|
    c.scope { |q| q.where(some_column: params[:some_value]) }

    c.eager_load(:account) { |a|
      a.eager_load(:something_else)
    }
  }

Ad Hoc Associations

Docs: eager_load_many, eager_load_one

On rare occasions you may need to eager load an association that doesn't actually exist in your models. Maybe it's too convoluted to represent with ActiveRecord. Or maybe it's just deeply nested and you don't want to waste time/memory loading all the intermediate records.

eager_load_many

The following example uses eager_load_many to load a non-existent, has-many association on Product called customers. Each product will have a customers attribute that contains the customers who bought the product.

OccamsRecord.
  query(Product.all).
  eager_load_many(:customers, {:id => :product_id}, "
    SELECT DISTINCT product_id, customers.*
    FROM line_items
      INNER JOIN orders ON line_items.order_id = orders.id
      INNER JOIN customers on orders.customer_id = customers.id
    WHERE
      line_items.product_id IN (:ids)
      AND customers.created_at >= :date
  ", binds: {
    date: params[:date]
  })

That's a lot, so we'll break it down. The method call really just looks like this:

eager_load_many(:customers, {:id => :product_id}, "SOME SQL", binds: {date: some_date})

The first argument, :customers, simply gives this made-up association a name. We'll call product.customers to get a product's customers.

The second argument, {:id => :product_id} defines the parent-child mapping. In this case it says, "The parent product records have an id field, and it will match the product_id field in the child customers."

The third argument is the SQL that loads customers. Notice the line_items.product_id IN (:ids) section. That's ensuring we're only loading customers that are related to the products we've loaded. OccamsRecord will provide those ids for us - don't worry. (And it's only called :ids because we defined the parent mapping as :id. If the parent mapping was instead :code, we'd put :codes in the SQL.)

The forth argument is optional. It can be a Hash or Array of any other query parameters you need.

eager_load_one

eager_load_one works exactly the same but for one-to-one relationships.

Nesting ad hoc associations

Like other eager loads, you can nest ad hoc ones. Here's an eager_load_many with an eager_load_one nested inside:

OccamsRecord.
  query(Product.all).
  eager_load_many(:customers, {:id => :product_id}, "SELECT...") {
    eager_load_one(:something, {:id => :customer_id}, "SELECT...")
  }

Here's an eager_load_many with a regular eager_load nested!

OccamsRecord.
  query(Product.all).
  eager_load_many(:customers, {:id => :product_id}, "SELECT...", model: Customer) {
    eager_load(:profile)
  }

Notice that we added model: Customer to eager_load_many's arguments. That annotates the ad hoc association with the model, allowing us to use the regular eager_load on Customer associations like :profile.

Hand-written SQL

Docs: sql

Sometimes you have to write a big, gnarly SQL query by hand. Here's an example using Common Table Expressions (CTE).

OccamsRecord.
  sql("
    WITH regional_sales AS (
      SELECT region, SUM(amount) AS total_sales
      FROM orders
      GROUP BY region
    ), top_regions AS (
      SELECT region
      FROM regional_sales
      WHERE total_sales > :min_sales
    )
    SELECT
      region,
      product,
      SUM(quantity) AS product_units,
      SUM(amount) AS product_sales
    FROM orders
    WHERE region IN (:regions)
    GROUP BY region, product;
  ", {
    min_sales: 10_000,
    regions: ["A", "B", "C"],
  }).
  each { |order|
     ...
  }

OccamsRecord supports several query param syntaxes:

# Rails-style
OccamsRecord.sql("SELECT ... WHERE orders.date > :date", {date: date})
OccamsRecord.sql("SELECT ... WHERE orders.date > ?", [date])

# Ruby-style
OccamsRecord.sql("SELECT ... WHERE orders.date > %{date}", {date: date})
OccamsRecord.sql("SELECT ... WHERE orders.date > %s", [date])

Eager Loading

Docs: sql, eager_load, eager_load_many, eager_load_one

Unlike ActiveRecord, OccamsRecord lets you eager load associations when using hand-written SQL. There are two ways to do it.

Using a model

If your results are close enough to a model, you can annotate the query with the model and eager_load its associations.

OccamsRecord.
  sql("
    SELECT * FROM orders
    WHERE order_date > :date
    ORDER BY order_date DESC, id
  ", {
    date: 30.days.ago
  }).
  model(Order).
  eager_load(:customer) {
    eager_load(:profile)
  }.
  each { |order|
    puts order.customer.profile.username
  }

This works because the query is returning orders.customer_id, and that's the foreign key for the Order#customer relationship.

Using ad hoc associations

If your results don't resemble a model, or you need to load associations from various models, you can write the SQL yourself in an "ad hoc association". See Ad Hoc Associations for more details.

OccamsRecord.
  sql("
    SELECT * FROM orders
    WHERE order_date > :date
    ORDER BY order_date DESC, id
  ", {
    date: 30.days.ago
  }).
  eager_load_one(:customer, {:customer_id => :id}, "
    SELECT * FROM customers
    WHERE id IN (:customer_ids)
  ").
  each { |order|
    puts order.customer.name
  }

This will take the customer_id column from the parent query and match it to the id column in the eager load query.

Batched Loading

Docs: sql, find_each, find_in_batches, find_each_with_cursor, find_in_batches_with_cursor

Unlike ActiveRecord, OccamsRecord lets you use batched loading with hand-written SQL. There are two ways to do it.

Cursor based

If you're using PostgreSQL, using cursors for batched loading is faster and easy:

OccamsRecord.
  sql("
    SELECT * FROM orders
    WHERE order_date > :date
    ORDER BY order_date DESC, id
  ", {
    date: 10.years.ago
  }).
  find_each_with_cursor(batch_size: 1000) { |order|
    ...
  }

Read more about using Cursors in OccamsRecord.

OFFSET & LIMIT based

With other databases you'll need to fall back to the traditional (and potentially slower) OFFSET & LIMIT approach.

OccamsRecord.
  sql("
    SELECT * FROM orders
    WHERE order_date > :date
    ORDER BY order_date DESC, id
    LIMIT :batch_limit
    OFFSET :batch_offset
  ", {
    date: 10.years.ago
  }).
  find_each(batch_size: 1000) { |order|
    ...
  }

OccamsRecord will provide the values for :batch_limit and :batch_offset. Just put the references in the right place.

Cursors

Docs: find_each_with_cursor, find_in_batches_with_cursor, cursor

Note: This section is only relevant to applications using PostgreSQL.

For batched loading, cursors perform better than the traditional OFFSET & LIMIT approach. If you're using PostgreSQL, take advantage of them with find_each_with_cursor and find_in_batches_with_cursor.

OccamsRecord.
  query(q).
  eager_load(:customer).
  find_each_with_cursor { |order|
    ...
  }
OccamsRecord.
  query(q).
  eager_load(:customer).
  find_in_batches_with_cursor { |orders|
    orders.each { |order| ... }
  }

If you need custom logic when using your cursor, use the lower-level cursor method:

OccamsRecord.
  query(q).
  eager_load(:customer).
  cursor.
  open { |cursor|
    cursor.move(:forward, 300)
    orders = cursor.fetch(:forward, 100)
    orders.each { |order| ... }
  }

The cursor var is an instance of OccamsRecord::Cursor.

Cursors with hand-written SQL

Using cursors with hand-written SQL is a breeze with find_each_with_cursor and find_in_batches_with_cursor.

OccamsRecord.
  sql("
    SELECT * FROM orders
    WHERE order_date > :date
    ORDER BY order_date DESC, id
  ", {
    date: 10.years.ago
  }).
  find_each_with_cursor(batch_size: 1000) { |order|
    ...
  }

And you still have low-level access via cursor:

OccamsRecord.
  sql("
    SELECT * FROM orders
    WHERE order_date > :date
    ORDER BY order_date DESC, id
  ", {
    date: 10.years.ago
  }).
  cursor.
  open { |cursor|
    cursor.move(:forward, 300)
    orders = cursor.fetch(:forward, 100)
    orders.each { |order| ... }
  }

Instance Methods

OccamsRecords results are just plain structs; they don't have methods from your Rails models. (Separating your persistence layer from your domain is good thing!) But sometimes you need a few methods. OccamsRecord provides two ways of accomplishing this.

Injecting modules

You may also specify one or more modules to be included in your results:

module MyOrderMethods
  def description
    "#{order_number} - #{date}"
  end
end

module MyProductMethods
  def expensive?
    price > 100
  end
end

orders = OccamsRecord.
  query(Order.all, use: MyOrderMethods).
  eager_load(:line_items) {
    eager_load(:product, use: [MyProductMethods, OtherMethods])
  }.
  find_each { |order|
    puts order.description # MyOrderMethods#description
    puts order.line_items[0].product.expensive? # MyProductMethods#expensive?
  }

ActiveRecord fallback mode

This is an ugly hack of last resort if you can’t easily extract a method from your model into a shared module. Plugins, like carrierwave, are a good example. When you call a method that doesn’t exist on an OccamsRecord result, it will initialize an ActiveRecord object and forward the method call to it.

The active_record_fallback option must be passed either :lazy or :strict (recommended). :strict enables ActiveRecord’s strict loading option, helping you avoid N+1 queries in your model code. :lazy allows them. (:strict is only available for ActiveRecord 6.1 and later.)

The following will forward any nonexistent methods for Order and Product records:

orders = OccamsRecord.
  query(Order.all, active_record_fallback: :strict).
  eager_load(:line_items) {
    eager_load(:product, active_record_fallback: :strict)
  }.
  run

Exceptions

Aside from various possible RuntimeError and ArgumentError exceptions, OccamsRecord has several well-defined exceptions for common errors. The most useful are described here.

OccamsRecord::MissingEagerLoadError

As mentioned in the introduction, OccamsRecord won't lazy load any associations for you. If you forget to eager load one and try to use it, it will throw an OccamsRecord::MissingEagerLoadError exception.

OccamsRecord.
  query(q).
  eager_load(:line_items) {
    eager_load(:product)
  }.
  find_each { |order|
    # this throws because it tries to access "category", which we didn't eager load
    puts order.line_items[0].product.category.name
  }

The message contains helpful information telling us exactly where we forgot to eager load it:

Association 'category' is unavailable on Product because it was not eager loaded! Occams Record trace: root.line_items.product

OccamsRecord::MissingColumnError

Elsewhere we noted that your eager loads can specify a subset of columns to select (for performance reasons). If you try to access a column you didn't select, it will throw an OccamsRecord::MissingColumnError exception.

OccamsRecord.
  query(q).
  eager_load(:line_items) {
    eager_load(:product, select: "id, name")
  }.
  find_each { |order|
    # this throws because it tries to access the "description" column, which we didn't select
    puts order.line_items[0].product.description
  }

The message contains helpful information telling us exactly where we forgot to select it:

Column 'description' is unavailable on Product because it was not included in the SELECT statement! Occams Record trace: root.line_items.product"