Learning ActiveRecord

ActiveRecord

The ARQI, which was previously known as Arel, is a special kind of query mapper. In short, you use the active record API to chain together ideas that are translated into SQL. To be effective, you should learn some SQL first. While the basics of SQL are the same for different flavors of SQL (databases), some databases have small differences. The Query interface tries to eliminate those differences to the Rails developers so that you don’t have to think about what database is being used on the backend.
Meet the databases:

Databases

PostgreSQL

pronounced “Postgres,” this is arguably the most popular database on the internet. For learning, it makes a cheap and fast tool that you can quickly get set up with. It boasts some features over its primary competitor, MySQL

MySQL 

MySQL is the rock-solid workhorse of the internet, powering massive data distributions and, arguably, the other most popular database server on the internet. Hello, meet MySQL.

MySQL is fast and performant and is extremely stable, but it is memory-heavy (that means it uses a lot of memory in the cloud), which makes it expensive to run. It also has sophisticated sharding which applies to you if you are a very massive website.

Oracle

Orcale is an enterprise database provided by Oracle Corporation. Oracle is typically found in larger, corporate environments.

SQLite 

A database you use only for development. This database keeps its data written on a disk file in your Rails app.

MS SQL

This is Microsoft SQL, Microsoft’s flavor of SQL. Sharing many similarities to MySQL, you will find developers use this after they outgrow Microsoft Access. MS SQL can be hosted on Azure, Microsoft’s cloud platform, and is hosted by many other providers as well.

Sybase

Is another enterprise database. You will note that the company “Sybase” was acquired by SAP and is now called SAP, but the database Sybase lives on and is one of the databases supported by the Rails ORM.

Ok with that overview of what database choices we might have let’s get into what ARQI actually is.

ActiveRecord Query Interface

The methods that you use to query from your base object, or Active Record root object. Each method will create or modify what will eventually be the complete SQL query. When you create the ARQI object, the idea of a query exists in memory as — a query that has yet to be executed. That is, it has not actually run the SQL statement. against the database. It does this to allow you to chain one ARQI-object, or Arel, onto another, thus combining SQL statements.

You will “chain” these methods together, that is, use them one after another with a dot. We call this “dot notation” and it works because instead of executing your SQL immediately, the ActiveRecord relation returns a chainable object.

When you create the ARQI object, the idea of a query exists in memory as — a query that has yet to be executed.

ActiveRecord Methods

find

Use find when you know the id of the thing you are searching for. However, when creating new things, you want to use the magic Rails methods

create_with

Ffor preventing race conditions within the creation action. Use create_with(locked: false) to tell Active Record to run the query in a locked transaction. If successful, it returns an unlocked AR object.

distinct

Here we can pass a distinct flag to the query which will make Active Record return only a single record per unique value.

eager_load

Here we tell Active Record to eagerly load the association specified. This eager loading by performing a LEFT OUTER JOIN. As discussed in eager_loads’s cousins includes and preload, Rails has two ways of dealing with N+1s: 1) “big join” and 2) “separate queries.” Eager load tells Rails to always use strategy #1 “the big join.”

extending

extract_associated

from

If you want to specify a non-standard Rails table name.

group

Allows you to pass ‘GROUP BY’ to your SQL to group records with a matching result together.

having

Allows you to specify a HAVING clause in your SQL. You will need a .group as well.

includes

This specifies the relationships that ActiveRecord will find in the resultset. This is useful for when we will need these relationships during view or JSON rendering. This is necessary to avoid N+1 queries. See the post Who Is N+1? for a discussion of N+1 queries.

Rails has two ways of dealing with N+1s: 1) “big join” and 2) “separate queries.” (Between the three choices preload, joins, and includes, typically prefer includes to start out with then try the other options if you are trying to optimize a big query)

joins

Here we tell Rails to use a join query, which is essentially an inner join. That will return results where Table A and Table B have non-nill keys for one another only. You use this if you are just filtering results – not accessing records from a relationship – joins is what you want. joins does not load data from the relationship into memory: accessing columns from the relationship will trigger N+1 queries.

left_outer_joins, left_joins (alias)

Here we perform a LEFT OUTER JOIN on the data — what you commonly call a LEFT JOIN (showing the “left” tables, which in ActiveRecord is the base of the object we started with).

Left Outer join Venn Diagram

I hope you’ve enjoyed this brief introduction to Active Record. The most important thing to remember is the “observer-effect” — the Arel object does not actually execute the SQL and “become” an array (really, it is an array-like Arel object), until you do something to observe it— like call, .each. .collect,you loop through it, you examine it, etc. Sometimes you want to invoke the AR chain, in which case you should use .all to tell Active Record to invoke the chain and turn it into an array-like object.

ActiveRecord Query Interface is an important tool to use object-relational mapping (ORM) to marry your ActiveRecord object to the SQL that will drive the queries against your database. Continuing from yesterday’s post about ActiveRecord Query Interface, I’ll pick up where I left off.

limit(row_count)

This is is how you restrict records to show only a block of records. You use limit to specify the number of records to show and offset to specify the number of records to skip. You will note that in MySQL, this translates to

LIMIT [offset_count], row_count

(Remember [] denotes an optional parameter, that is, if we pass only one parameter, it will be assumed to be the row_count, not the offset.)

After that, the value passed to limit is the row_count. You should combine limit with offset if you want something other than the inital offset (which is 0), see below.

none

This is useful when you have a method that wants to return a chainable relation to another method, but this method needs to specify “none” or “no records.”

offset(offset_count)

This is limit’s cousin: the offset. This is the number of records to skip before starting to return the row_count (specified in the limit). For example, in MySQL, this translates to:

LIMIT offset, row_count

To combine limit & offset, use .limit(10).offset(20). This produces

LIMIT 10, 20

Above all, this tells MySQL to pull 10 records starting at the 21st record returned by the query. (We skip, or offset, the first 20 records.)

order

This is how you sort your query. Order takes a symbol or a block. If you pass it a symbol, it will order by that field name, ascending. On the other hand, if you pass the symbol as a hash, like so, you can specify a reverse sort (or ‘descending’)

.order(created_at: :desc)

ORDER BY created_at1

Similarly, with the :desc flag,

ORDER BY created_at DESC



preload

Here we tell rails to load side relationships. Rails has two ways of dealing with N+1s: 1) “big join” and 2) “separate queries.” Unlike .includes (which uses either of the strategies) .preload loads the association data using strategy #2 separate query. You might do this because you started with .includes and you later determine via measurement that the big join strategy is slower than the separate queries strategy so you want to force your app to use strategy #2.

readonly

Use this to tell Rails to create ActiveRecord objects that cannot be written to. In other words, this will be a read-only object.

reorder

Use this to override the default scope on the model. Therefore, whatever default order is specified, Active Record will now ignore it and use the newly specified scope.

reselect

Here the magic of Active Relations exposes itself. Let’s say you define a relation, like

people = Person.select(:age, :name)

However, some logic in your app determines that you need different fields. You can use reselect to override the existing select set on your AR chain. To be clear, this will mean your existing select will not be used.

reverse_order

Use .reverse_order to take the current sort order (or default order) and reverse it.

select(:field1, :filed2, :field3)

This is where you can specify which fields to “hydrate” the object with. You would only use this to limit the default behavior which is to hydrate (or load) all of the fields that ActiveRecord finds.

where

Here you can specify a condition, like ["age > ?", min_age]
Notice that where typically takes an array, and the question marks (?) are replaced with the variables you see in positions #2, #3, etc of that array you pass.

In conclusion, Active Record is a sharp tool to construct optimal SQL, but you need to know both sides of the equation.

I hope you’ve enjoyed this brief overview of ActiveRecord.