Immutability in Ruby Part 2: Domain Models


What does the concept of immutable data mean from a Ruby programmer’s perspective? How is immutability supported in Ruby, and why should you care?

This is the second part of a two-part article based on a short talk I gave at the March 2013 meetup of the Helsinki Ruby Brigade.

Entities and Values

We have looked at basic data structures and the effects of in-place state mutation there. I hope to have convinced you that there are very valid reasons for favoring immutable data structures over mutable ones.

But is there a place for immutability in our domain models? When does making an object immutable make sense in business logic code?

Let’s look at what some of the well-known voices in the object-oriented community have to say. In particular, in what kinds of contexts do they talk about immutable objects?

Eric Evans: Entities and Value Objects

In Eric Evans’s book Domain-Driven Design (2003), and in the DDD community it sparked, there is a clear distinction between two kinds of concepts in our domain models: Entities and Value Objects.

An object defined primarily by its identity is called an ENTITY.

An object that represents a descriptive aspect of the domain with no conceptual identity is called a VALUE OBJECT.

There are things called Entities and there are things called Value Objects. Entities have something called an Identity, whereas Value Objects don’t. We will return to these concepts shortly.

But what about immutability?

As long as a VALUE OBJECT is immutable, change management is simple - there isn’t any change except full replacement. Immutable objects can be freely shared.

Immutability is a great simplifier in an implementation, making sharing and reference passing safe. It is also consistent with the meaning of a value. If the value of an attribute changes, you use a different VALUE OBJECT, rather than modifying the existing one.

It sounds like Evans thinks of immutability as a good idea when implementing Value Objects - both because it fits conceptually, and because it simplifies implementation.

So Evans talks about entities, values, identities, and immutability. But is this just a Domain-Driven Design thing? Apparently not:

Steve Freeman and Nat Pryce: Values and Objects

The book Growing Object-Oriented Software, Guided by Tests (2009, often lovingly referred to as GOOS) could be described as one of the primary sources of modern OO design wisdom. In it, authors Steve Freeman and Nat Pryce propose a distinction very similar to the one Evans proposes in DDD:

When designing a system, it’s important to distinguish between values […] and objects.

Values are immutable instances that model fixed quantities. They have no individual identity, so two value instances are effectively the same if they have the same state. Objects, on the other hand, use mutable state to model their behavior over time.

In practice, this means we split our system into two “worlds”: values, which are treated functionally, and objects, which implement the stateful behavior of the system.

A similar, but slightly different, terminology is repeated here: Immutable values, objects, identities. The central point is also the same as with Evans: There are two different categories of things in our domains, and we should clearly distinguish between them.

Rich Hickey: Identities and Values

Finally, let’s take a look at Rich Hickey’s thinking on the subject:

People accustomed to OO conceive of their programs as mutating the values of objects. They understand the true notion of a value, say, 42, as something that would never change, but usually don’t extend that notion of value to their object’s state.

That is a failure of their programming language. These programming languages use the same constructs for modeling values as they do for identities, and default to mutability, causing all but the most disciplined programmers to create many more identities than they should, creating identities out of things that should be values.

Again we have the same terminology: Mutation, objects, values, identities. Hickey also makes the distinction between things that have an Identity and things that don’t.

So what exactly is the common thread in these three examples?

Two Concepts

The classes in our domains can be divided into two different groups: Values and Objects.

Values are things that are defined in terms of their contents (or “state”) - like the number 42. Values are immutable: We can’t change 42 to be something else, as we discussed in the previous post.

Objects are things that have an Identity, and may have different values over time. For example, a User Object might have a name field, which may change over time. When a person gets married, their last name may change but to us they are still the same user, just with a mutated name. The user has an identity that persists over time - typically modeled as an id field corresponding to a primary key in an underlying database.

The distinction is not just between primitive or simple values and compound objects. A compound object may also be a value. For example, consider an Address class, with fields for a street address, a zip code, and a city. Is it an Object or a Value? Well, in most cases it should probably be a Value. Two addresses on the same street are two different addresses; you don’t get one from the other by mutating the street address field. This is in contrast to the user whose last name may change - it is definitely still conceptually the same user.

Notice how this is exactly the same reasoning we applied to 42. An address is no less of a Value than 42. It’s just that we’re more accustomed to thinking of numbers as Values than of the types we define ourselves.

Objects                          | Values
A.K.A. Entities (in DDD)         | A.K.A. Value Objects (in DDD)
Typically mutable                | Immutable
Have an identity (e.g. “id”)     | No separate identity, the value itself is the identity
Examples: User, Account, Company | Examples: Numbers: 42, Strings: "abc" *, Dates: 2013-03-28 **, Address, AccountNumber

*) Though not immutable in Ruby.
**) Though not immutable in Java before JDK 8.
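
To make the first footnote concrete, here is a small sketch of how numbers and strings behave in Ruby. The variable names are only for illustration, and String.new is used so the example does not depend on whether string literals happen to be frozen in your Ruby setup.

```ruby
# Integers are frozen values: there is no way to turn 42 into something else.
answer = 42
puts answer.frozen?        # true

# Strings are mutable unless frozen. String.new returns an unfrozen string.
name = String.new("abc")
alias_of_name = name       # a second reference to the same object
alias_of_name << "def"     # in-place mutation through the alias
puts name                  # abcdef

# A frozen string behaves like a value: mutation attempts raise an error.
frozen = String.new("abc").freeze
begin
  frozen << "def"
rescue RuntimeError => e   # FrozenError is a subclass of RuntimeError
  puts "cannot mutate: #{e.class}"
end
```

The second output is the interesting one: the change made through one reference is visible through the other, which is exactly what cannot happen with 42.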

Aside: Clojure

This is not an article about Clojure, but I would still like to point out that Clojure is one of the very few languages that models Values and Identities as explicitly separate concepts.

Even if you don’t plan to become a Clojure programmer, I recommend taking a look at how Clojure does this.

Writing Clojure programs has clarified my thinking on what this distinction actually is, much more than just reading about it has done.

Entities, Values, and Immutability in Ruby

As a language-agnostic concept, there are definitely two different building blocks in our domain models, as we have just seen. But let’s turn the discussion back to Ruby. What does the Object/Value distinction look like in code?

Take a look at the following class:

user.rb
class User
  attr_accessor :id,
                :name,
                :address
end

Is it an Object or a Value? Well, obviously it is an Object. A User is something that persists over time, and something whose individual fields may change over time. It even has an id, which is a concrete representation of the user’s identity.

Now, how about this:

address.rb
class Address
  attr_accessor :street,
                :zip,
                :city
end

Is that an Object or a Value? Arguably this one is a Value. As we discussed, an address is an example of something that doesn’t change over time. Different addresses should be different Address objects. When you move to a new location, you don’t fiddle with the street field of your address, you have a new Address.

The problem is that between User and Address, there really isn’t any difference in the code. There are no separate Object and Value constructs in the Ruby language - there are just classes. (To be fair, this is the case in all major OO languages.)
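
Here is a rough sketch of why that lack of distinction can hurt. It repeats the two classes from above so that it runs on its own; the users and street names are made up for illustration.

```ruby
# The mutable Address and User from above, repeated so the sketch is self-contained.
class Address
  attr_accessor :street, :zip, :city
end

class User
  attr_accessor :id, :name, :address
end

home = Address.new
home.street = "1 Main St"

alice = User.new
bob = User.new
alice.address = home
bob.address = home                 # both users share the same Address object

# Alice moves, and someone "fiddles with the street field":
alice.address.street = "9 Elm St"

puts bob.address.street            # 9 Elm St
```

Bob never moved, but because the Address is mutable and shared, his address changed along with Alice's.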

One way to make the distinction clearer would be to make the Address class immutable, as all values should be. So how can we do that?

Accessors

First of all, we need to get rid of the writers for those attributes. They should only have readers.

When we remove the writers, we must also define a constructor for the class, so we can define the initial values for the fields:

address.rb
class Address
  attr_reader :street,
              :zip,
              :city

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
  end
end

That’s better. But is Address an immutable class now?

Freezing

We can easily think of a situation where some new method introduced to Address changes one of those fields - internal methods don’t need the accessors since they can just access the fields directly. There’s also the possibility of accessing the instance variables externally via something like instance_variable_set.

What we need to do is to freeze the Address. freeze is a method that exists on all Ruby objects, and its purpose is to say “after being frozen, none of the instance variables of this object can be reassigned. If you try to do that, I will raise an exception.” (The exception is a FrozenError on Ruby 2.5 and later, and a RuntimeError before that.)

If we freeze the address right in the constructor, that should take care of it. None of the instance variables can be changed after construction:

address.rb
class Address
  attr_reader :street,
              :zip,
              :city

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
    freeze
  end
end
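
A quick way to see what freezing buys us is to try both loopholes mentioned earlier. In this sketch, relocate! is a hypothetical method standing in for "some new method introduced to Address"; it is not part of the class above.

```ruby
class Address
  attr_reader :street, :zip, :city

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
    freeze
  end

  # Hypothetical method added later that tries to change state from the inside.
  def relocate!(street)
    @street = street
  end
end

address = Address.new("1 Main St", "00100", "Helsinki")
puts address.frozen?               # true

# Loophole 1: an internal method assigning to an instance variable.
begin
  address.relocate!("9 Elm St")
rescue RuntimeError => e           # FrozenError is a subclass of RuntimeError
  puts "relocate! rejected: #{e.class}"
end

# Loophole 2: reaching in from the outside.
begin
  address.instance_variable_set(:@city, "Espoo")
rescue RuntimeError => e
  puts "instance_variable_set rejected: #{e.class}"
end

puts address.street                # 1 Main St
puts address.city                  # Helsinki
```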

Are we done?

Defensive Copying

At this point we’re getting into interesting territory. On the face of it, Address is immutable since we can’t change any of its fields. But the thing is, one or more of the objects stored in those fields might not be immutable themselves. For example, street might be a String, which in Ruby is mutable, as we have discussed. Someone can just get the String stored in street and change all of its contents. Our efforts for immutability so far do not help us prevent that.

Immutability is a transitive property in this sense. If anything within an object graph is mutable, the whole object graph is mutable. You have to go all the way to be able to brag about your object’s immutability.
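
The leak is easy to reproduce. This sketch repeats the frozen Address so it can run on its own, and uses String.new to make sure the strings involved are unfrozen.

```ruby
class Address
  attr_reader :street, :zip, :city

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
    freeze                          # shallow: freezes the Address, not its members
  end
end

street = String.new("1 Main St")
address = Address.new(street, "00100", "Helsinki")

puts address.frozen?                # true
puts address.street.frozen?         # false

address.street << ", Apt 2"         # mutates the String held by the frozen Address
puts address.street                 # 1 Main St, Apt 2

street.upcase!                      # the caller's original reference works too
puts address.street                 # 1 MAIN ST, APT 2
```

Note the second path: even if the readers were locked down, whoever passed the String into the constructor still holds a reference to it.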

One way to make sure your object doesn’t “leak” mutable members is to do defensive copying in readers. In practice, when someone asks for one of the attributes, we always return a copy instead of the original one. The caller can do with the copy what they please and we won’t be affected by it:

address.rb
class Address

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
    freeze
  end

  def street
    @street.dup
  end

  def zip
    @zip.dup
  end

  def city
    @city.dup
  end
end

This is slightly verbose, although it’s the kind of verbosity that could easily be eliminated with some metaprogramming.

If you decide to do this, note that dup does a shallow copy of the object, so if one of its members is mutable, you still have a problem.
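
As for the metaprogramming, one possible shape for it is sketched below. The module name DefensiveReaders and the method defensive_reader are my own inventions for this example, not something Ruby or a gem provides.

```ruby
# Hypothetical helper: defines readers that return a copy of the instance variable.
module DefensiveReaders
  def defensive_reader(*names)
    names.each do |name|
      define_method(name) do
        instance_variable_get("@#{name}").dup
      end
    end
  end
end

class Address
  extend DefensiveReaders
  defensive_reader :street, :zip, :city

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
    freeze
  end
end

address = Address.new(String.new("1 Main St"), "00100", "Helsinki")
address.street << ", Apt 2"         # only the returned copy changes
puts address.street                 # 1 Main St
```

This only guards the readers. The caller who passed the strings in still holds references to the originals, so duplicating the arguments in the constructor as well would be needed to close that gap.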

Deep Freezing

Another way to make sure that everything within the object graph is indeed immutable is to deep freeze it. You can achieve this by calling freeze not only on the object itself, but also on all of its members, and their members, and so on.

One could do this manually, but there are also libraries for it. One of them is called ice_nine, and it provides a method, IceNine.deep_freeze, that takes one argument: the object to deep freeze.

address.rb
class Address
  attr_reader :street,
              :zip,
              :city

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
    IceNine.deep_freeze(self)
  end
end

With that, we can be fairly sure that we have an immutable object. (Although, in Ruby, there always seems to be a way to get around any restriction, so YMMV. One would have to go to some lengths to mutate this object.)
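
For the curious, doing it manually might look roughly like the sketch below. DeepFreeze is a hypothetical module written for this article, not how ice_nine is implemented, and it only walks instance variables, Arrays, and Hashes.

```ruby
# Hypothetical hand-rolled deep freeze (not the ice_nine implementation).
module DeepFreeze
  # `seen` tracks visited objects by identity so cyclic graphs terminate.
  def self.call(object, seen = {}.compare_by_identity)
    return object if seen.key?(object)
    seen[object] = true

    case object
    when Hash
      object.each { |key, value| call(key, seen); call(value, seen) }
    when Array
      object.each { |element| call(element, seen) }
    end

    object.instance_variables.each do |ivar|
      call(object.instance_variable_get(ivar), seen)
    end

    object.freeze
  end
end

class Address
  attr_reader :street, :zip, :city

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
    DeepFreeze.call(self)
  end
end

address = Address.new(String.new("1 Main St"), "00100", "Helsinki")
puts address.frozen?                # true
puts address.street.frozen?         # true
```

One side effect worth knowing about: the strings passed to the constructor are frozen in place, so the caller's own references become frozen as well.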

Equality and Comparisons

Update 2013-03-29: This point about equality came up on Reddit. I think it deserves to be mentioned.

Since values don’t have an identity, and two Values with the same value should be considered equal, we need to override the default implementation of == in all of our Value classes. The default implementation is based on the object id, so two different Value objects with exactly the same contents are not considered equal, which is wrong.

address.rb
class Address
  attr_reader :street,
              :zip,
              :city

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
  end

  # Two addresses are equal when all of their fields are equal.
  def ==(other)
    street == other.street &&
      zip == other.zip &&
      city == other.city
  end
end

Our implementation of == assumes address equality will never be tested against non-addresses (things without methods for street, zip, and city). If you can’t make that assumption, you’ll additionally need type checks or respond_to? checks in the method.
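
A version with a type check could look something like this sketch. It also defines eql? and hash, which goes slightly beyond the text above, because Ruby's Hash relies on those two methods when an object is used as a key. The protected state helper is a name I made up for the example.

```ruby
class Address
  attr_reader :street, :zip, :city

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
    freeze
  end

  # Equal only to other Addresses with the same field values.
  def ==(other)
    other.is_a?(Address) && state == other.state
  end
  alias_method :eql?, :==

  # Equal values must produce equal hash codes to work as Hash keys.
  def hash
    state.hash
  end

  protected

  # Hypothetical helper: the fields that define this value.
  def state
    [street, zip, city]
  end
end

a = Address.new("1 Main St", "00100", "Helsinki")
b = Address.new("1 Main St", "00100", "Helsinki")

puts a == b                         # true
puts a == "1 Main St"               # false
puts({ a => :found }[b].inspect)    # :found
```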

If your values have a natural ordering (i.e. one can be considered larger than another), as something like an Area might, consider including the Comparable module and implementing the <=> method to gain a full set of comparison operators for your value.

See this classic blog post by Alan Skorkin for a good discussion on equality and comparison.
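
As a minimal sketch of that idea, here is what an Area value might look like. The class and its square_meters field are assumptions made up for this example.

```ruby
# Hypothetical Area value, ordered by its size in square meters.
class Area
  include Comparable

  attr_reader :square_meters

  def initialize(square_meters)
    @square_meters = square_meters
    freeze
  end

  # Comparable derives <, <=, ==, >=, >, between? and clamp from this method.
  def <=>(other)
    return nil unless other.is_a?(Area)   # nil means "not comparable"
    square_meters <=> other.square_meters
  end
end

small = Area.new(40)
large = Area.new(120)

puts small < large                          # true
puts small == Area.new(40)                  # true
puts [large, small].min.square_meters       # 40
puts small.between?(Area.new(10), large)    # true
```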

Gems

The problem with an immutable Value class as we have defined above is that it still doesn’t clearly communicate that it’s a Value. Ideally, one could see at a glance what kind of construct a class is defining. With the implementations we have seen, the reader must infer that information by noticing that it’s immutable, doesn’t have an id, and other such characteristics of a Value.

Fortunately, there are some gems available for this as well. They provide libraries for explicitly defining a class as a Value. One of these gems is called Virtus. With Virtus, you can define a Value by including a module and then declaring the attributes. With that information, Virtus will define a constructor, readers for the attributes, and a value-based implementation of ==.

address.rb
class Address
  include Virtus::ValueObject

  attribute :street, String
  attribute :zip,    String
  attribute :city,   String
end

Another gem that may be useful is called simply Values. It defines a construct very similar to the built-in Struct, the difference being that a Value is immutable and a Struct is not:

address.rb
Address = Value.new(:street, :zip, :city)
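
For comparison, here is how the built-in Struct behaves, along with one way to freeze it. This is plain Ruby rather than the Values gem, and the names MutableAddress and FrozenAddress are only there to keep the two variants apart.

```ruby
# A plain Struct: value-based ==, but with setters.
MutableAddress = Struct.new(:street, :zip, :city)

a = MutableAddress.new("1 Main St", "00100", "Helsinki")
b = MutableAddress.new("1 Main St", "00100", "Helsinki")
puts a == b                         # true
a.city = "Espoo"                    # Struct members can be reassigned
puts a == b                         # false

# The same Struct, frozen at construction time.
FrozenAddress = Struct.new(:street, :zip, :city) do
  def initialize(*)
    super
    freeze
  end
end

c = FrozenAddress.new("1 Main St", "00100", "Helsinki")
begin
  c.city = "Espoo"
rescue RuntimeError => e            # FrozenError is a subclass of RuntimeError
  puts "rejected: #{e.class}"
end
puts c.city                         # Helsinki
```

Like the freeze call in the Address constructor earlier, this is a shallow freeze, so the caveats about mutable members still apply.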

ActiveRecord et al.

We’ve discussed Objects, Values, and the distinction between them. For most of us though, there are some practical issues to deal with when building our domain models. Perhaps the biggest of those has to do with ORM libraries. In Ruby that usually means ActiveRecord or one of its ActiveModel brethren.

The problem with ORMs is that while they provide a very convenient way to interface with relational databases, they also restrict what we can do in our object models to a subset that’s “database friendly”. When we try to model our business logic within ORM derived classes, we start to feel these restrictions pretty quickly.

This usually isn’t a problem in small applications, and indeed it was one of the major advantages of Rails early on - you didn’t have to add so many layers in your application to get stuff done.

Values Within ActiveRecord: composed_of and workarounds

It is possible to model immutable values with ActiveRecord, but it is slightly awkward.

For a long time, you could use the composed_of construct to define the fact that “this bunch of attributes should actually be grouped into that separate class.” That separate class was a true, immutable Value.

At the time of writing, composed_of was slated for removal in Rails 4, because it hadn’t been used much and was expensive for the Rails core team to maintain. (The removal was later reverted, so composed_of is still available in Rails.) In a blog post about this issue, José Valim has outlined a way you can do something similar manually:

user.rb
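class User < ActiveRecord::Base
  # Builds the Address Value on demand from the three columns.
  def address
    @address ||= Address.new(street, zip, city)
  end

  # Writes the Address Value back to the three columns.
  def address=(address)
    self[:street] = address.street
    self[:zip] = address.zip
    self[:city] = address.city
    @address = address
  end
end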

Basically, the street, zip, and city fields are part of the User class and the corresponding database table, but they can also be wrapped into the Address Value. An Address is constructed on demand, and serialized back to the three fields when set. This is a completely acceptable, though slightly verbose, solution for integrating Values into ActiveRecord models.

Architecting For OO Domain Models

It seems that lately more people in the Rails community have started architecting their way around these ORMs and other limiting frameworks, by completely separating them from the core domain model.

One of these architectural models is called the Hexagonal Architecture. Its main idea is to put the domain model right in the center of the system, and organize any external interfaces - such as databases or web controllers - around it.

The domain model is free of any framework code, so you have the full power of OO at your fingertips there. That includes the possibility to distinguish between Objects and Values without anything getting in your way.
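
To give a rough feel for what this can look like, here is a very small sketch in plain Ruby. All of the names in it (InMemoryUserRepository, RelocateUser, move_to) are made up for illustration, and the in-memory repository stands in for whatever database adapter a real application would plug in.

```ruby
# Domain model: plain Ruby, no framework code.

# A Value: immutable, compared by contents.
Address = Struct.new(:street, :zip, :city) do
  def initialize(*)
    super
    freeze
  end
end

# An Object (Entity): has an identity and changes over time.
class User
  attr_reader :id, :name, :address

  def initialize(id, name, address)
    @id = id
    @name = name
    @address = address
  end

  # Moving replaces the Address Value instead of mutating it.
  def move_to(new_address)
    @address = new_address
  end
end

# Adapter at the edge of the system (hypothetical, in-memory).
# A database-backed class with the same two methods could replace it.
class InMemoryUserRepository
  def initialize
    @users = {}
  end

  def save(user)
    @users[user.id] = user
    user
  end

  def find(id)
    @users.fetch(id)
  end
end

# Use case that depends only on the repository's interface.
class RelocateUser
  def initialize(repository)
    @repository = repository
  end

  def call(user_id, new_address)
    user = @repository.find(user_id)
    user.move_to(new_address)
    @repository.save(user)
  end
end

repository = InMemoryUserRepository.new
repository.save(User.new(1, "Alice", Address.new("1 Main St", "00100", "Helsinki")))

RelocateUser.new(repository).call(1, Address.new("9 Elm St", "02100", "Espoo"))
puts repository.find(1).address.city    # Espoo
```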

[Figure: Hexagonal Architecture]

For more information about hexagonal architectures, see Alistair Cockburn’s original material, the GOOS book, Uncle Bob’s article on clean architectures, and Matt Wynne’s talk on Hexagonal Rails.

Summary

In our domain models, we should clearly distinguish between Objects and Values. This has been a guideline of good object-oriented design for a long time, surfacing in different contexts with slightly different terminology.

Though often only simple values (like 42) are considered true values, the concept applies to compound things (like addresses) as well. The distinction is natural once you start thinking about it.

Our programming languages don’t help us much in making this distinction, but there are things we can do. Enforcing immutability is one of those, and we have seen how to do it in Ruby.

Frameworks can be an impediment to good OO design - ORM frameworks like ActiveRecord especially so - and we can either work around their limitations or totally separate our domain models from them by architecting our applications carefully.
