Enumerating Enumerable

The next method I’m going to cover in Enumerating Enumerable — the series of articles in which I try to do a better job of documenting Ruby’s Enumerable module than Ruby-Doc.org does — is inject, a.k.a. reduce. Not only is it one of the trickiest methods to explain, it’s also one of the cornerstones of functional programming. I thought that I’d take a little time to explain what the function does.

inject

The term inject comes from Smalltalk and isn’t terribly descriptive. I remember reading the documentation for it and being all confused until I saw some examples. I then realized that I’d seen this function before, but under two different names.

reduce

The Second-Best Accordion Picture Ever
Burning Man 1999: gratuitous nudity and even more gratuitous accordion!

The second name by which I encountered this function is reduce, and it was at Burning Man 1999. I was to start a new job the week after Burning Man, and I had to learn at least some basic Python by then. So along with my camping gear, accordion and a kilo of candy for barter, I also brought my laptop (a 233MHz Toshiba Satellite with a whopping 96MB of RAM) and O’Reilly’s Learning Python and noodled during the downtime (early morning and afternoon) on Python 1.6. When I got to covering the reduce function, I was confused until I saw some examples, after which I realized that I’d seen that function before, but under a different name.

(You may have also heard of reduce through Google’s much-vaunted MapReduce programming model.)

fold

The first name by which I encountered this function is fold, or more specifically, “fold left” or “foldl”, and it was at the “Programming Paradigms” course I took at Crazy Go Nuts University. “Programming Paradigms” was a second-year course and had the reputation of being the most difficult course in the computer science curriculum. The intended purpose of this course was to provide students with an introduction to functional programming (these days, they use Haskell and Prolog; back then, it was Miranda). Its actual effect was to make most of the students swear off functional programming for the rest of their lives.

In spite of the trauma from this course, I ended up remembering a lot from it that I was able to apply, first to Python and now to Ruby. One of these things is a cute little trick for cementing in your mind what fold does.

What Ruby-doc.org Says

Before I cover that cute little trick, let’s take a look at what Ruby-doc.org’s documentation has to say about Enumerable‘s inject method.

One thing you’ll find at Ruby-doc.org is that as of Ruby 1.8.7, inject gained a synonym: the more familiar term reduce.

As for the description of the inject/reduce method, I don’t find it terribly helpful:

Combines all elements of enum by applying a binary operation, specified by a block or a symbol that names a method or operator.

If you specify a block, then for each element in enum<i> the block is passed an accumulator value (<i>memo) and the element. If you specify a symbol instead, then each element in the collection will be passed to the named method of memo. In either case, the result becomes the new value for memo. At the end of the iteration, the final value of memo is the return value fo the method.

If you do not explicitly specify an initial value for memo, then uses the first element of collection is used as the initial value of memo.

(Yes, those stray <i> tags are part of the text of the description for inject. Hopefully they’ll fix that soon.)

This confusing text becomes a little clearer with some examples. The most typical example of inject/reduce/fold in action is the classic “compute the sum of the numbers in this range or array” problem. There are a number of approaches you can take in Ruby, all of which use inject/reduce:
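Here’s a minimal sketch of a few such approaches. The variable names are purely illustrative, and it assumes Ruby 1.8.7 or later so that reduce and the symbol form are available:

```ruby
# Four equivalent ways to add up the numbers 1 through 8.
numbers = (1..8)

# 1. A block, with the first element used as the initial memo.
sum_block = numbers.inject { |memo, number| memo + number }

# 2. A block, with an explicit initial memo of 0.
sum_block_with_initial = numbers.inject(0) { |memo, number| memo + number }

# 3. A symbol naming the operator to apply.
sum_symbol = numbers.reduce(:+)

# 4. An explicit initial memo plus a symbol.
sum_symbol_with_initial = numbers.reduce(0, :+)

puts sum_block # 36
```

All four calls produce the same total; inject and reduce are interchangeable in each of them.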

The reduce method takes some kind of operation and applies it across the enumerable to yield a single result. In this case, the operation is addition.

Explaining how that operation is applied is a little trickier, but I do just that in the next section.

Demonstrating inject / reduce / fold With a Piece of Paper and Literal Folding

To explain what’s happening in the code above, I’m going to use a piece of paper. I’ve folded it into 8 even sections and then numbered each section, as shown in the photo below:

Think of the paper as the range (1..8). We’re now going to compute the sum of the numbers in this range, step by step, using a literal fold — that is, by folding the paper. I’m going to start folding from the left side of the paper, and when I do, I’m going to add the numbers that I’m folding into each other.

In the first fold, I’m folding the number 1 onto the number 2. Adding these two numbers yields 3, which I write on the back of the fold:

For the second fold, I fold the first number 3 onto the second number 3. The sum of these two numbers is 6, and I write that on the back of the resulting fold:

I fold again: this time, it’s the number 6 onto the number 4, the sum of which is 10. I write that number down on the resulting fold:

Next, I fold 10 onto 5, yielding the number 15:

I then fold 15 onto 6, which gives me 21:

Next comes 21 folded onto 7, which makes for a sum of 28:

And finally, 28 folded onto 8, which gives us a final total of 36.

And there you have it: a paper-based explanation of inject/reduce/fold, as well as why I often refer to the operation as “folding”.
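The paper folds can also be traced in code. This is just a sketch: the folds array and the wording of each message are my own illustrative additions, used to record what the block sees on each pass:

```ruby
# Record each "fold" as inject works through the range (1..8).
folds = []

total = (1..8).inject do |memo, number|
  sum = memo + number
  folds << "#{memo} folded onto #{number} gives #{sum}"
  sum # the block's result becomes the next memo
end

puts folds
puts "Total: #{total}"
```

Since no initial value is given, the first element (1) becomes the starting memo, so the block runs seven times for eight numbers, once per fold.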

Enumerating Enumerable: Enumerable#group_by

by Joey deVilla on August 31, 2008

Once again, it’s Enumerating Enumerable time! This is the latest in my series of articles where I set out to make better documentation for Ruby’s Enumerable module than Ruby-Doc.org’s. In this installment — the seventeenth in the series — I cover the group_by method.

In case you missed any of the previous articles, they’re listed and linked below:

  1. all?
  2. any?
  3. collect / map
  4. count
  5. cycle
  6. detect / find
  7. drop
  8. drop_while
  9. each_cons
  10. each_slice
  11. each_with_index
  12. entries / to_a
  13. find_all / select
  14. find_index
  15. first
  16. grep

Enumerable#group_by Quick Summary

Graphic representation of the "group_by" method in Ruby's "Enumerable" module.

In the simplest possible terms: Break a collection into groups based on some given criteria.
Ruby version: 1.9 only
Expects: A block containing the criteria by which the items in the collection will be grouped.
Returns: A hash where each key represents a group. Each key’s corresponding value is an array containing the members of that group.
RubyDoc.org’s entry: Enumerable#group_by

Enumerable#group_by and Arrays

When used on an array, group_by iterates through the array, passing each element to the block. The result value of the block is the group into which the element will be placed.

Example 1

For the first example, I’ll use some code similar to the example given in Ruby-doc.org’s writeup of group_by:
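A minimal sketch of that kind of call, grouping the numbers 0 through 15 by their remainder when divided by 3 (the groups variable name is just for illustration):

```ruby
# Group the numbers 0 through 15 by the result of number % 3.
groups = (0..15).group_by { |number| number % 3 }

groups.each do |remainder, members|
  puts "#{remainder}: #{members.inspect}"
end
# 0: [0, 3, 6, 9, 12, 15]
# 1: [1, 4, 7, 10, 13]
# 2: [2, 5, 8, 11, 14]
```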

In the code above, the numbers 0 through 15 are passed to the block, which receives each number as the parameter number. The group that each number is placed into is determined by the result value of the block, number % 3, whose result can be one of 0, 1 or 2. This means that:

  • The resulting hash will have three groups, represented by the keys 0, 1 and 2
  • The key 0’s corresponding value is an array containing the numbers in the range (0..15) that are evenly divisible by 3 (i.e. the numbers for which number % 3 is 0).
  • The key 1’s corresponding value is an array containing the numbers in the range (0..15) that when divided by 3 leave a remainder of 1 (i.e. the numbers for which number % 3 is 1).
  • The key 2’s corresponding value is an array containing the numbers in the range (0..15) that when divided by 3 leave a remainder of 2 (i.e. the numbers for which number % 3 is 2).

Example 2

In the first example, the keys in the resulting hash are the same type as the values in the array whose contents we’re grouping. In this example, I’ll show that the keys in the resulting hash don’t have to be the same type as the values in the array.
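Here’s a sketch of such a grouping. The particular list of names is an assumption on my part, chosen so that the name lengths come out as 5, 4 and 7:

```ruby
# Hypothetical list of Simpson family names.
simpsons = %w(Homer Marge Bart Lisa Abraham Herbert)

# Group the names by their length.
by_length = simpsons.group_by { |simpson| simpson.length }

puts by_length.inspect
# Keys are integers (5, 4, 7); values are arrays of strings.
```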

In the code above, each Simpson name is passed to the block, which receives it as the parameter simpson. The block’s result is the length of simpson, and this result is the group into which the name will go.

In the resulting hash:

  • Note that the keys are integers while the names in the groups are strings.
  • The key 5’s array contains those names in Simpsons that are 5 characters in length.
  • The key 4’s array contains those names in Simpsons that are 4 characters in length.
  • The key 7’s array contains those names in Simpsons that are 7 characters in length.

Example 3

In the previous two examples, the keys for the resulting hash were calculated from the values in the initial array. In this example, I’ll demonstrate that the keys for the groupings can be determined in a completely arbitrary fashion that has nothing to do with the values:
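One way to sketch this is to ignore the element entirely and pick a group at random. The names and team colours below are made up for illustration, and Array#sample assumes Ruby 1.9 or later:

```ruby
# Hypothetical contestants to be split into two teams at random.
contestants = %w(Alice Bob Carol Dave Eve Frank)
team_names = [:red, :blue]

# The block never looks at the contestant; it just picks a team at random.
teams = contestants.group_by { |contestant| team_names.sample }

puts teams.inspect # the result differs from run to run
```

Every contestant still lands in exactly one group; it’s only the choice of group that’s arbitrary. A team that never gets picked simply doesn’t appear as a key.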

Enumerable#group_by and Hashes

When used on a hash, group_by passes each key/value pair in the hash to the block, which you can “catch” as either:

1. A two-element array, with the key as element 0 and its corresponding value as element 1, or
2. Two separate items, with the key as the first item and its corresponding value as the second item.

Example 1

In this example, we’ll group the cast of Family Guy by the item that they’re bringing to a potluck dinner:
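A sketch of how that might look. The hash contents are invented for illustration (who brings what is my assumption), and the two calls show both ways of “catching” the key/value pair:

```ruby
# Hypothetical potluck list: cast member => item they're bringing.
potluck = {
  "Peter"  => "lasagna",
  "Lois"   => "potato salad",
  "Chris"  => "lasagna",
  "Meg"    => "brownies",
  "Stewie" => "potato salad",
  "Brian"  => "wine"
}

# Catch each pair as a two-element array...
by_item_pair = potluck.group_by { |pair| pair[1] }

# ...or as two separate items.
by_item = potluck.group_by { |name, item| item }

puts by_item["lasagna"].inspect # [["Peter", "lasagna"], ["Chris", "lasagna"]]
```

Note that each group holds two-element [key, value] arrays rather than just the names.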

Example 2

In the previous example, the groupings were based on a calculation performed on the objects in the original hash. In this example, the groupings will be random: a random number generator will determine whose car each potluck attendee will ride to the potluck dinner:
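Here’s a rough sketch of that idea. The potluck hash and the list of cars are both made up for illustration, and Array#sample assumes Ruby 1.9 or later:

```ruby
# Hypothetical potluck list: cast member => item they're bringing.
potluck = {
  "Peter"  => "lasagna",
  "Lois"   => "potato salad",
  "Chris"  => "lasagna",
  "Meg"    => "brownies",
  "Stewie" => "potato salad",
  "Brian"  => "wine"
}

# Hypothetical cars available for the trip.
cars = ["Peter's car", "Quagmire's car", "Cleveland's car"]

# Ignore the key and value; assign each attendee to a random car.
rides = potluck.group_by { |name, item| cars.sample }

puts rides.inspect # the result differs from run to run
```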

Enumerating Enumerable: Enumerable#first

by Joey deVilla on August 15, 2008

Welcome to another installment of Enumerating Enumerable, my series of articles in which I attempt to do a better job of documenting Ruby’s Enumerable module than Ruby-Doc.org. In this installment, I cover the first method.

In case you missed any of the previous articles, they’re listed and linked below:

  1. all?
  2. any?
  3. collect / map
  4. count
  5. cycle
  6. detect / find
  7. drop
  8. drop_while
  9. each_cons
  10. each_slice
  11. each_with_index
  12. entries / to_a
  13. find_all / select
  14. find_index

Enumerable#first Quick Summary

Graphic representing the "first" method in Ruby's "Enumerable" module

In the simplest possible terms: What are the first n items in the collection?
Ruby version: 1.8 and 1.9
Expects: An optional integer n that specifies the first n items of the collection to return. If this integer is not given, n is 1 by default.
Returns: If first is applied to a collection containing m elements:

  • The first item in the collection, if m > 0 and no argument n is provided.
  • An array containing the first n items in the collection, if m > 0 and an argument n is provided.
  • nil if the collection is empty and no argument n is provided.
  • The empty array [] if the collection is empty and an argument n is provided.

RubyDoc.org’s entry: Enumerable#first

Enumerable#first and Arrays

When used on an array without an argument, first returns the first item in the array:

When used on an array with an integer argument n, first returns an array containing the first n items in the original array:

When used on an empty array, first returns:

  • nil if no argument n is provided
  • The empty array, [], if an argument n is provided
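The array cases above can be sketched as follows (the simpsons array is just illustrative data):

```ruby
simpsons = %w(Homer Marge Bart Lisa Maggie)

# No argument: the first item itself.
first_one = simpsons.first      # "Homer"

# Integer argument: an array of the first n items.
first_three = simpsons.first(3) # ["Homer", "Marge", "Bart"]

# Empty array: nil without an argument, [] with one.
empty = []
empty_one = empty.first         # nil
empty_three = empty.first(3)    # []
```

Note that first(1) returns a one-element array, not the bare item.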

Enumerable#first and Hashes

In Ruby 1.8 and previous versions, hash order is seemingly arbitrary. Starting with Ruby 1.9, hashes retain the order in which they were defined, which makes the first method a little more applicable.

When used on a hash without an argument, first returns the first item in the hash as a two-element array, with the key as the first element and the corresponding value as the second element.

When used on a hash with an integer argument n, first returns an array containing the first n items in the hash, with each item represented as a two-element array:

When used on an empty hash, first returns:

  • nil if no argument n is provided
  • The empty array, [], if an argument n is provided
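A sketch of the hash cases, assuming Ruby 1.9 or later so that the hash keeps its insertion order (the ages are illustrative values):

```ruby
ages = { "Homer" => 36, "Marge" => 34, "Bart" => 10 }

# No argument: the first key/value pair as a two-element array.
first_pair = ages.first     # ["Homer", 36]

# Integer argument: an array of two-element arrays.
first_two = ages.first(2)   # [["Homer", 36], ["Marge", 34]]

# Empty hash: nil without an argument, [] with one.
empty = {}
empty_one = empty.first     # nil
empty_two = empty.first(2)  # []
```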

Enumerating Enumerable: Enumerable#find_index

by Joey deVilla on August 14, 2008

Once again, it’s Enumerating Enumerable, my series of articles in which I attempt to outdo Ruby-Doc.org’s documentation of Ruby’s Enumerable module. In this article, I cover the find_index method, which was introduced in Ruby 1.9.

In case you missed any of the previous articles, they’re listed and linked below:

  1. all?
  2. any?
  3. collect / map
  4. count
  5. cycle
  6. detect / find
  7. drop
  8. drop_while
  9. each_cons
  10. each_slice
  11. each_with_index
  12. entries / to_a
  13. find_all / select

Enumerable#find_index Quick Summary

Graphic representation of the "find_index" method in Ruby's "Enumerable" module

In the simplest possible terms: What’s the index of the first item in the collection that meets the given criteria?
Ruby version: 1.9
Expects: A block containing the criteria.
Returns:
  • The index of the first item in the collection that matches the criteria, if there is one.
  • nil, if no item in the collection matches the criteria.
RubyDoc.org’s entry: Enumerable#find_index

Enumerable#find_index and Arrays

When used on an array, find_index passes each item in the array to the given block and either:

  • Stops when the current item causes the block to return a value that evaluates to true (that is, anything that isn’t false or nil) and returns the index of that item, or
  • Returns nil if there is no item in the array that causes the block to return a value that evaluates to true.

Some examples:
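A short sketch, using an illustrative array of names:

```ruby
simpsons = %w(Homer Marge Bart Lisa Maggie)

# Index of the first name starting with "B".
b_index = simpsons.find_index { |name| name.start_with?("B") } # 2

# Several names have 5 or more characters; only the first match counts.
long_index = simpsons.find_index { |name| name.length >= 5 }   # 0

# No name is longer than 10 characters, so the result is nil.
missing = simpsons.find_index { |name| name.length > 10 }      # nil
```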

Enumerable#find_index and Hashes

When used on a hash, find_index passes each key/value pair in the hash to the block, which you can “catch” as either:

  1. A two-element array, with the key as element 0 and its corresponding value as element 1, or
  2. Two separate items, with the key as the first item and its corresponding value as the second item.

As with arrays, find_index:

  • Stops when the current item causes the block to return a value that evaluates to true (that is, anything that isn’t false or nil) and returns the index of that item, or
  • Returns nil if there is no item in the hash that causes the block to return a value that evaluates to true.

Some examples:
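A sketch with an illustrative hash, assuming Ruby 1.9 or later so that the hash has a predictable order for the index to refer to:

```ruby
ages = { "Homer" => 36, "Marge" => 34, "Bart" => 10 }

# Catch each pair as a two-element array...
minor_index_pair = ages.find_index { |pair| pair[1] < 18 }    # 2

# ...or as two separate items.
minor_index = ages.find_index { |name, age| age < 18 }        # 2

# Nobody is over 100, so the result is nil.
centenarian = ages.find_index { |name, age| age > 100 }       # nil
```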

Using find_index as a Membership Test

Although Enumerable has a method for checking whether an item is a member of a collection (the include? method and its synonym, member?), find_index is a more powerful membership test for two reasons:

  1. include?/member? only check membership by using the == operator, while find_index lets you define a block to set up all sorts of tests. include?/member? asks “Is there an object X in the collection equal to my object Y?” while find_index can be used to ask “Is there an object X in the collection that matches these criteria?”
  2. include?/member? returns true if there is an object X in the collection that is equal to the given object Y. find_index goes one step further: not only can it be used to report the equivalent of true if there is an object X in the collection that is equal to the given object Y, it also reports its location in the collection.

A quick example of this use in action:
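Something along these lines (the names are illustrative). It works as a membership test because every index, including 0, counts as true in Ruby, while nil counts as false:

```ruby
simpsons = %w(Homer Marge Bart Lisa Maggie)

# include? can only tell us that "Bart" is in there somewhere.
has_bart = simpsons.include?("Bart")                          # true

# find_index tells us that a match exists and where it is.
bart_index = simpsons.find_index { |name| name == "Bart" }    # 2

# It can also test criteria that include? can't express.
long_index = simpsons.find_index { |name| name.length > 5 }   # 4

message = if long_index
  "Found a long name at index #{long_index}: #{simpsons[long_index]}"
else
  "No long names"
end
puts message
```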

Parts that Haven’t Been Implemented Yet

Ruby-Doc.org’s documentation is generated from the comments in the C implementation of Ruby. It mentions a way of calling find_index that is just like calling include?/member?:


Ruby 1.9 is considered to be a work in progress, so I suppose it’ll get implemented in a later release.
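For what it’s worth, current Ruby releases do accept a value in place of the block. A minimal sketch, assuming a Ruby newer than the early 1.9 builds discussed here (the names are illustrative):

```ruby
simpsons = %w(Homer Marge Bart Lisa Maggie)

# Value form: index of the first item equal to the argument.
bart_index = simpsons.find_index("Bart") # 2

# No item equal to the argument: nil.
moe_index = simpsons.find_index("Moe")   # nil
```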

Enumerating Enumerable: Enumerable#drop_while

by Joey deVilla on July 25, 2008

After the wackiness of the past couple of weeks — some travel to see family, followed by a busy week of tech events including DemoCamp 18, Damian Conway’s presentation, FAILCamp and RubyFringe — I’m happy to return to Enumerating Enumerable, the article series in which I attempt to do a better job at documenting Ruby’s Enumerable module than Ruby-Doc.org does.

In this article, the eighth in the series, I’m going to cover a method introduced in Ruby 1.9: drop_while.

I’m going through Enumerable’s methods in alphabetical order. If you missed any of the earlier articles, I’ve listed them all below:

  1. all?
  2. any?
  3. collect / map
  4. count
  5. cycle
  6. detect / find
  7. drop

Enumerable#drop_while Quick Summary

Graphic representation of the "drop_while" method in Ruby's "Enumerable" module

In the simplest possible terms: Given a collection and a condition, return an array made of the collection’s items, starting at the first item that doesn’t meet the condition.
Ruby version: 1.9 only
Expects: A block containing the condition.
Returns: An array made up of the remaining items, if there are any.
RubyDoc.org’s entry: Enumerable#drop_while

Enumerable#drop_while and Arrays

When used on an array, drop_while returns a copy of the array created by going through the original array’s items in order, dropping elements until it encounters an element that does not meet the condition. The resulting array is basically a copy of the original array, starting at the first element that doesn’t meet the condition in the block.

As in many cases, things become clearer with some examples:
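A short sketch with illustrative numbers:

```ruby
numbers = [1, 2, 3, 4, 5, 1, 2]

# Drop items while they're less than 3. The trailing 1 and 2 survive
# because dropping stops for good at the first item that fails the test.
from_three = numbers.drop_while { |n| n < 3 }  # [3, 4, 5, 1, 2]

# Every item meets the condition, so everything is dropped.
nothing = numbers.drop_while { |n| n < 10 }    # []

# The very first item fails the condition, so nothing is dropped.
everything = numbers.drop_while { |n| n > 10 } # [1, 2, 3, 4, 5, 1, 2]
```

The original array is left untouched in all three cases.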

Enumerable#drop_while and Hashes

When used on a hash, drop_while effectively:

  • Creates an array based on the hash, with each element in the hash represented as a two-element array where the first element contains the key and the second element contains the corresponding value, then
  • goes through each element in the array, dropping elements until it encounters the first element that doesn’t meet the condition in the block. The resulting array is an array of two-element arrays, starting at the first element that doesn’t meet the condition in the block.

Once again, examples will make this all clear:
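A sketch with an illustrative hash, assuming Ruby 1.9 or later so that the hash keeps its insertion order:

```ruby
ages = { "Stewie" => 1, "Chris" => 14, "Meg" => 17, "Peter" => 43 }

# Catch each pair as two separate items and drop everyone under 15.
older = ages.drop_while { |name, age| age < 15 }
# [["Meg", 17], ["Peter", 43]]

# The same thing, catching each pair as a two-element array.
older_pair = ages.drop_while { |pair| pair[1] < 15 }
```

The result is an array of two-element arrays, not a hash.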

Enumerable#drop_while’s Evil Twin, Enumerable#take_while

I’ll cover take_while in detail in a later installment, but for now, an example should suffice:
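A minimal sketch comparing the two on the same illustrative array:

```ruby
numbers = [1, 2, 3, 4, 5, 1, 2]

# take_while keeps items from the front until the condition first fails...
taken = numbers.take_while { |n| n < 3 }   # [1, 2]

# ...and drop_while returns everything from that point on.
dropped = numbers.drop_while { |n| n < 3 } # [3, 4, 5, 1, 2]

# With the same condition, the two halves add up to the original array.
puts (taken + dropped == numbers)          # true
```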

Here’s another article in the Enumerating Enumerable series, in which I attempt to improve upon RubyDoc.org’s documentation for the Enumerable module, which I find rather lacking. If you’ve missed the previous articles in the series, I’ve listed them below:

  1. all?
  2. any?
  3. collect / map
  4. count
  5. cycle

This installment covers a method that goes by two names: detect or find. I personally prefer find, as it’s shorter and the term I tend to use for its function.

Enumerable#detect/Enumerable#find Quick Summary

Graphic representation of the "detect" or "find" method in Ruby's "Enumerable" module.

In the simplest possible terms: What’s the first item in the collection that meets the given criteria?
Ruby version: 1.8 and 1.9
Expects:
  • A block containing the criteria.
  • An optional argument containing a proc that calculates a “default” value (that is, the value to return if no item in the collection matches the criteria).
Returns: The first item in the collection that matches the criteria, if one exists. If no such item exists in the collection, detect/find returns:
  • nil, if no argument is provided.
  • the result of calling the proc given as the argument, if one is provided.
RubyDoc.org’s entry: Enumerable#detect / Enumerable#find

Enumerable#detect/Enumerable#find and Arrays

When used on an array without an argument, detect/find passes each item from the collection to the block and…

  • If the current item causes the block to return a value that doesn’t evaluate to false, detect/find stops going through the collection and returns the item.
  • If no item in the collection causes the block to return a value that doesn’t evaluate to false, detect/find returns nil.

In the examples that follow, I’ll be using the find method. detect does exactly the same thing; it’s just that I prefer find.
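A short sketch with an illustrative array of names:

```ruby
simpsons = %w(Homer Marge Bart Lisa Maggie)

# The first name shorter than 5 characters. "Lisa" also qualifies,
# but find stops at the first match.
short_name = simpsons.find { |name| name.length < 5 }   # "Bart"

# No name is longer than 10 characters, so the result is nil.
long_name = simpsons.find { |name| name.length > 10 }   # nil

# detect is the same method under another name.
detected = simpsons.detect { |name| name.length < 5 }   # "Bart"
```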

Using the optional argument is a topic big enough to merit its own section, which appears later in this article.

Enumerable#detect/Enumerable#find and Hashes

When used on a hash and a block is provided, detect/find passes each key/value pair in the hash to the block, which you can “catch” as either:

  1. A two-element array, with the key as element 0 and its corresponding value as element 1, or
  2. Two separate items, with the key as the first item and its corresponding value as the second item.

When used on a hash without an argument, detect/find passes each item from the collection to the block and…

  • If the current item causes the block to return a value that doesn’t evaluate to false, detect/find stops going through the collection and returns the item.
  • If no item in the collection causes the block to return a value that doesn’t evaluate to false, detect/find returns nil.
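A sketch of the hash cases, using an illustrative hash and assuming Ruby 1.9 or later so that “first” is well defined:

```ruby
ages = { "Homer" => 36, "Marge" => 34, "Bart" => 10, "Lisa" => 8 }

# Catch each pair as a two-element array...
first_minor_pair = ages.find { |pair| pair[1] < 18 }   # ["Bart", 10]

# ...or as two separate items.
first_minor = ages.find { |name, age| age < 18 }       # ["Bart", 10]

# Nobody is over 100, so the result is nil.
centenarian = ages.find { |name, age| age > 100 }      # nil
```

The item returned is the matching key/value pair as a two-element array.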

Using Enumerable#detect/Enumerable#find with the Optional Argument

detect/find’s optional argument lets you specify a proc or lambda whose return value will be the result in cases where no object in the collection matches the criteria.

(Unfortunately, a complete discussion of procs and lambdas is beyond the scope of this article. I highly recommend looking at Eli Bendersky’s very informative article, Understanding Ruby blocks, Procs and methods.)

I think that the optional argument is best explained through examples…
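Here’s a minimal sketch. The no_match lambda and the :nothing_found symbol are names I’ve made up for illustration:

```ruby
numbers = [1, 3, 5, 7]

# A lambda that supplies the "default" value. find calls it with no
# arguments, and only when nothing in the collection matches.
no_match = lambda { :nothing_found }

# A match exists, so the lambda is never called.
found = numbers.find(no_match) { |n| n > 4 }     # 5

# No even numbers here, so the lambda's return value is the result.
fallback = numbers.find(no_match) { |n| n.even? } # :nothing_found

# Without the argument, the same search simply returns nil.
plain = numbers.find { |n| n.even? }              # nil
```

Note that the argument has to be something that responds to call; passing a plain value such as a string or a number won’t work.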

To see a “real” application of detect/find's optional argument, see this Ruby Quiz problem.

Enumerating Enumerable: Enumerable#count

by Joey deVilla on July 2, 2008

Welcome to the fourth installment of Enumerating Enumerable, a series of articles in which I challenge myself to do a better job of documenting Ruby’s Enumerable module than RubyDoc.org does. In this article, I’ll cover Enumerable#count, one of the new methods added to Enumerable in Ruby 1.9.

In case you missed the earlier installments, they’re listed (and linked) below:

  1. all?
  2. any?
  3. collect / map

Enumerable#count Quick Summary

Graphic representation of the Enumerable#count method in Ruby

In the simplest possible terms: How many items in the collection meet the given criteria?
Ruby version: 1.9 only
Expects: Either:

  • An argument to be matched against the items in the collection, or
  • A block containing an expression to test the items in the collection
Returns: The number of items in the collection that meet the given criteria.
RubyDoc.org’s entry: Enumerable#count

Enumerable#count and Arrays

When used on an array and an argument is provided, count returns the number of times the value of the argument appears in the array:

When used on an array and a block is provided, count returns the number of items in the array for which the block returns true:
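A sketch of both forms, using an illustrative array:

```ruby
letters = %w(a b a c a)

# Argument form: how many items are equal to "a"?
a_count = letters.count("a")                              # 3

# Block form: how many items make the block return true?
other_count = letters.count { |letter| letter != "a" }    # 2
```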

RubyDoc.org says that when count is used on an array without an argument or a block, it simply returns the number of items in the array (which is what the length/size methods do). However, when I tried it in irb and ruby, I got results like this:

Enumerable#count and Hashes

As with arrays, when used on a hash and an argument is provided, count returns the number of times the value of the argument appears in the hash. The difference is that for the comparison, each key/value pair is treated as a two-element array, with the key being element 0 and the value being element 1.

count is not useful when used with a hash and an argument. It will only ever return one of two values:

  • 1 if the argument is a two-element array and there is an item in the hash whose key matches element [0] of the array and whose value matches element [1] of the array.
  • 0 for all other cases.
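The two cases above can be sketched like this (the hash is illustrative):

```ruby
ages = { "Homer" => 36, "Bart" => 10 }

# A two-element array matching an existing key/value pair: 1.
exact_pair = ages.count(["Bart", 10])  # 1

# Right key, wrong value: 0.
wrong_value = ages.count(["Bart", 11]) # 0

# A bare key or a bare value never equals a [key, value] pair: 0.
key_only = ages.count("Bart")          # 0
value_only = ages.count(10)            # 0
```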

When used with a hash and a block, count is more useful. count passes each key/value pair in the hash to the block, which you can “catch” as either:

  1. A two-element array, with the key as element 0 and its corresponding value as element 1, or
  2. Two separate items, with the key as the first item and its corresponding value as the second item.

Each key/value pair is passed to the block and count returns the number of items in the hash for which the block returns true.
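Here’s a sketch using a hash of movie ratings. The titles and scores are made-up illustrative values, not real review data:

```ruby
# Hypothetical ratings out of 10.
ratings = {
  "Iron Man"      => 8,
  "WALL-E"        => 9,
  "Get Smart"     => 6,
  "The Love Guru" => 2
}

# Catch each pair as two separate items...
worth_seeing = ratings.count { |title, score| score >= 7 }    # 2

# ...or as a two-element array.
worth_seeing_pair = ratings.count { |pair| pair[1] >= 7 }     # 2

# How many should you skip?
skippable = ratings.count { |title, score| score < 5 }        # 1
```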

(You should probably skip The Love Guru completely, or at least until it gets aired on TV for free.)
