ruby

Inside Enumeration in Ruby

Jeff Kreeftmeijer

Jeff Kreeftmeijer on

Inside Enumeration in Ruby

Welcome back to another edition of Ruby Magic! A year ago, we learned about Ruby’s Enumerable module, which provides the methods you use when working with enumerable objects like arrays, ranges, and hashes.

Back then, we created a LinkedList class to show how to make an object enumerable by implementing the #each method on it. By including the Enumerable module, we were able to call methods like #count, #map and #select on any linked list without having to implement them ourselves.

We've learned how to use enumerables, but how do they work? Part of the magic in enumerables in Ruby comes from their internal implementation, which is all based on the single #each method, and even allows chaining enumerators.

Today, we’ll learn how the methods in the Enumerable class are implemented and how Enumerator objects allow chaining enumeration methods.

As you've become accustomed to, we'll dive in deep by implementing our own versions of the Enumerable module and Enumerator class. So, put on your over-engineering helmet and let's go!

Linked Lists

Before we begin, let’s start with a new version of the linked list class we wrote previously.

1class LinkedList
2  def initialize(head = nil, *rest)
3    @head = head
4
5    if rest.first.is_a?(LinkedList)
6      @tail = rest.first
7    elsif rest.any?
8      @tail = LinkedList.new(*rest)
9    end
10  end
11
12  def <<(head)
13    @head ? LinkedList.new(head, self) : LinkedList.new(head)
14  end
15
16  def inspect
17    [@head, @tail].compact
18  end
19
20  def each(&block)
21    yield @head if @head
22    @tail.each(&block) if @tail
23  end
24end

Unlike the previous version, this implementation allows empty lists to be created, as well as lists with more than two items. This version also allows passing a linked list as the tail when initializing another.

1irb> LinkedList.new
2=> []
3irb> LinkedList.new(1)
4=> [1]
5irb> LinkedList.new(1, 2)
6=> [1,[2]]
7irb> LinkedList.new(1, 2, 3)
8=> [1,[2,[3]]]
9irb> LinkedList.new(1, LinkedList.new(2, 3))
10=> [1,[2,[3]]]
11irb> LinkedList.new(1, 2, LinkedList.new(3))
12=> [1,[2,[3]]]

Previously, our LinkedLIst class included the Enumerable module. When mapping over an object using one of Enumerable's methods, the result are stored in an array. This time, we'll implement our own version to make sure our methods return new linked lists instead.

Enumerable Methods

Ruby's Enumerable module comes with enumeration methods like #map, #count, and #select. By implementing the #each method and including the Enumerable module in our class, we'd be able to use those methods directly on our linked lists.

Instead, we'll implement DIYEnumerable and import that instead of Ruby's version. This isn't something you'd typically do, but it will give us a clear insight into how enumeration works internally.

Let's start with #count. Each of the importable methods in the Enumerable class uses the #each method we implemented in our LinkedList class to loop over the object to calculate their results.

1module DIYEnumerable
2  def count
3    result = 0
4    each { |element| result += 1 }
5    result
6  end
7end

In this example, we've implemented the #count method on a new DIYEnumerable module that we’ll include in our linked list. It starts a counter at zero and calls the #each method to add one to the counter for every loop. After looping over all elements, the method returns the resulting counter.

1module DIYEnumerable
2  # ...
3
4  def map
5    result = LinkedList.new
6    each { |element| result = result << yield(element) }
7    result
8  end
9end

The #map method is implemented similarly. Instead of keeping a counter, it uses an accumulator, which starts as an empty list. We'll loop over all elements in the list, and yield the passed block on each element. The result of each yield is appended to the accumulator list.

The method returns the accumulator after looping over all of the elements in the input list.

1class LinkedList
2  include DIYEnumerable
3
4  #...
5end

After including the DIYEnumerable in our LinkedList, we can test our newly added #count and #map methods.

1irb> list = LinkedList.new(73, 12, 42)
2=> [73, [12, [42]]]
3irb> list.count
4=> 3
5irb> list.map { |element| element * 10 }
6=> [420, [120, [730]]]

Both methods work! The #count method correctly counts the items in the list, and the #map method runs a block for each item and returns an updated list.

Reversed lists

However, the #map method seems to have reverted the list. That’s understandable, as the #<< method on our linked list class prepends items to the list instead of appending them, which is a feature of the recursive nature of linked lists.

For situations where it’s essential that the order of the list is retained, we need a way to reverse the list when mapping over it. Ruby implements Enumerable#reverse_each, which loops over an object in reverse. That which sounds like an excellent solution to our problem. Sadly, we can’t use an approach like that because our list is nested. We don’t know how long the list is until we loop over it entirely.

Instead of running the block over the list in reverse, we’ll add a version of #reverse_each that does this two steps. It first loops over the list to reverse it by creating a new list. After that, it runs the block over the reversed list.

1module DIYEnumerable
2  # ...
3
4  def reverse_each(&block)
5    list = LinkedList.new
6    each { |element| list = list << element }
7    list.each(&block)
8  end
9
10  def map
11    result = LinkedList.new
12    reverse_each { |element| result = result << yield(element) }
13    result
14  end
15end

Now, we'll use #reverse_each in our #map method, to make sure it's returned in the correct order.

1irb> list = LinkedList.new(73, 12, 42)
2=> [73, [12, [42]]]
3irb> list.map { |element| element * 10 }
4=> [730, [120, [420]]]

It works! Whenever we call our #map method on a linked list, we'll get a new list back in the same order as the original.

Chaining Enumeration with Enumerators

Through the #each method implemented on our linked list class and the included DIYEnumerator, we can now loop both ways and map over linked lists.

1irb> list.each { |x| p x }
273
312
442
5irb> list.reverse_each { |x| p x }
642
712
873
9irb> list.reverse_each.map { |x| x * 10 }
10=> [730, [120, [420]]]
11=> [420, [120, [730]]]

However, what if we need to map over a list in reverse? Since we now reverse the list before mapping over it, it always returns in the same order as the original list. We've already implemented both #reverse_each and #map, so we should be able to chain them together to be able to map backwards. Luckily, Ruby's Enumerator class can help with that.

Last time, we made sure to call Kernel#to_enum if the LinkedList#each method was called without a block. This allowed for chaining enumerable methods by returning an Enumerator object. To find out how the Enumerator class works, we’ll implement our own version.

1class DIYEnumerator
2  include DIYEnumerable
3
4  def initialize(object, method)
5    @object = object
6    @method = method
7  end
8
9  def each(&block)
10    @object.send(@method, &block)
11  end
12end

Like Ruby's Enumerator, our enumerator class is a wrapper around a method on an object. By bubbling up to the wrapped object, we can chain enumeration methods.

This works because a DIYEnumerator instance is enumerable itself. It implements #each by calling the wrapped object, and includes the DIYEnumerable module so all enumerable methods can be called on it.

We’ll return an instance of our DIYEnumerator class if no block is passed to the LinkedList#each method.

1class LinkedList
2  # ...
3
4  def each(&block)
5    if block_given?
6      yield @head
7      @tail.each(&block) if @tail
8    else
9      DIYEnumerator.new(self, :each)
10    end
11  end
12end

Using our own enumerator, we can now chain enumeration to get the result in the original order without having to pass an empty block to the #reverse_each method call.

1irb> list = LinkedList.new(73, 12, 42)
2=> [73, [12, [42]]]
3irb> list.map { |element| element * 10 }
4=> [420, [120, [730]]]

Eager and lazy enumeration

This concludes our peek into the implementation of the Enumerable module and the Enumerator class for now. We've learned how some of the enumerable methods work and how an enumerator helps chaining enumeration by wrapping an enumerable object.

There are some issues with our approach, though. By its nature, enumeration is eager, meaning it loops over the list as soon as one of the enumerable methods is called on it. While that's fine in most cases, mapping over a list in reverse reverses the list twice, which should be unnecessary.

To lower the number of loops, we could employ Enumerator::Lazy to delay looping to the last moment, and have duplicate list reversing cancel itself out.

We'll have to save that for a future episode, though. Don't want to miss that, and further expeditions into Ruby's magical inner workings? Subscribe to the Ruby Magic e-mail newsletter, to get new articles delivered to your inbox as soon as they're published.

Share this article

RSS

AppSignal monitors your apps

AppSignal provides insights for Ruby, Rails, Elixir, Phoenix, Node.js, Express and many other frameworks and libraries. We are located in beautiful Amsterdam. We love stroopwafels. If you do too, let us know. We might send you some!

Discover AppSignal
AppSignal monitors your apps