Categories
Uncategorized

Enumerating Enumerable: A Cute Trick for Explaining inject / reduce / fold

Enumerating Enumerable

The next method I’m going to cover in Enumerating Enumerable — the series of articles in which I try to do a better job of documenting Ruby’s Enumerable module than Ruby-Doc.org does — is inject, a.k.a. reduce. Not only is it one of the trickiest methods to explain, it’s also one of the cornerstones of functional programming. I thought that I’d take a little time to explain what the function does.

inject

The term inject comes from Smalltalk and isn’t terribly descriptive. I remember reading the documentation for it and being all confused until I saw some examples. I then realized that I’d seen this function before, but under two different names.

reduce

The Second-Best Accordion Picture Ever
Burning Man 1999: gratuitous nudity and even more gratuitous accordion!

The second name by which I encountered this function is reduce, and it was at Burning Man 1999. I was to start a new job the week after Burning Man, and I had to learn at least some basic Python by then. So along with my camping gear, accordion and a kilo of candy for barter, I also brought my laptop (a 233Mhz Toshiba Sattelite with a whopping 96MB of RAM) and O’Reilly’s Learning Python and noodled during the downtime (early morning and afternoon) on Python 1.6. When I got to covering the reduce function, I was confused until I saw some examples, after which I realized that I’d seen that function before, but under a different name.

(You may have also heard of reduce through Google’s much-vaunted MapReduce programming model.)

fold

The first name by which I encountered this function is fold, or more specifically, “fold left” or “foldl”, and it was at the “Programming Paradigms” course I took at Crazy Go Nuts University. “Programming Paradigms” was a second-year course and had the reputation of being the most difficult course in the computer science curriculum. The intended purpose of this course was to provide students with an introduction to functional programming (these days, they use Haskell and Prolog, back then, it was Miranda). Its actual effect was to make most of the students swear off functional programming for the rest of their lives.

In spite of the trauma from this course, I ended up remembering a lot from it that I was able to apply, first to Python and now to Ruby. One of these things is a cute little trick for cememnting in your mind what fold does.

What Ruby-doc.org Says

Before I cover that cute little trick, let’s take a look at what Ruby-doc.org’s documentation has to say about Enumerable‘s inject method.

One thing you’ll find at Ruby-doc.org is that as of Ruby 1.8.7 and later, inject gained a synonym: the more familiar term reduce.

As for the description of the inject/reduce method, I don’t find it terribly helpful:

Combines all elements of enum by applying a binary operation, specified by a block or a symbol that names a method or operator.

If you specify a block, then for each element in enum<i> the block is passed an accumulator value (<i>memo) and the element. If you specify a symbol instead, then each element in the collection will be passed to the named method of memo. In either case, the result becomes the new value for memo. At the end of the iteration, the final value of memo is the return value fo the method.

If you do not explicitly specify an initial value for memo, then uses the first element of collection is used as the initial value of memo.

(Yes, those stray <i> tags are part of the text of the description for inject. Hopefully they’ll fix that soon.)

This confusing text becomes a little clearer with some examples. The most typical example of inject/reduce/fold in action is the classic “compute the sum of the numbers in this range or array” problem. There are a number of approaches you can take in Ruby, all of which use inject/reduce:

(1..8).reduce(:+)
=> 36

(1..8).reduce {|sum, number| sum += number}
=> 36

(1..8).reduce(0) {|sum, number| sum += number}
=> 36

The reduce method takes some kind of operation and applies it across the enumerable to yield a single result. In this case, the operation is addition.

Explaining how that operation is applied is a little trickier, but I do just that in the next section.

Demonstrating inject / reduce / fold With a Piece of Paper and Literal Folding

To explain what’s happening in the code above, I’m going to do use a piece of paper. I’ve folded it into 8 even sections and then numbered each section, as shown in the photo below:

Think of the paper as the range (1..8). We’re now going to compute the sum of the numbers in this range, step by step, using a literal fold — that is, by folding the paper. I’m going to start folding from the left side of the paper, and when I do, I’m going to add the numbers that I’m folding into each other.

In the first fold, I’m folding the number 1 onto the number 2. Adding these two numbers yields 3, which I write on the back of the fold:

For the second fold, I fold the first number 3 onto the second number 3. The sum of these two numbers is 6, and I write that on the back of the resulting fold:

I fold again: this time, it’s the number 6 onto the number 4, the sum of which is 10. I write that number down on the resulting fold:

Next, I fold 10 onto 5, yielding the number 15:

I then fold 15 onto 6, which gives me 21:

Next comes 21 folded onto 7, which makes for a sum of 28:

And finally, 28 folded onto 8, which gives us a final total of 36.

And there you have it: a paper-based explanation of inject/reduce/fold, as well as why I often refer to the operation as “folding”.

Categories
Uncategorized

Enumerating Enumerable: Enumerable#group_by

Enumerating Enumerable

Once again, it’s Enumerating Enumerable time! This is the latest in my series of articles where I set out to make better documentation for Ruby’s Enumerable module than Ruby-Doc.org’s. In this installment — the seventeenth in the series — I cover the group_by method.

In case you missed any of the previous articles, they’re listed and linked below:

  1. all?
  2. any?
  3. collect / map
  4. count
  5. cycle
  6. detect / find
  7. drop
  8. drop_while
  9. each_cons
  10. each_slice
  11. each_with_index
  12. entries / to_a
  13. find_all / select
  14. find_index
  15. first
  16. grep

Enumerable#group_by Quick Summary

Graphic representation of the "group_by" method in Ruby's "Enumerable" module.

In the simplest possible terms Break a collection into groups based on some given criteria.
Ruby version 1.9 only
Expects A block containing the criteria by which the items in the collection will be grouped.
Returns A hash where each key represents a group. Each key’s corresponding value is an array containing the members of that group.
RubyDoc.org’s entry Enumerable#group_by

Enumerable#group_by and Arrays

When used on an array, group_by iterates through the array, passing each element to to the block. The result value of the block is the group into which the element will be placed.

Example 1

For the first example, I’ll use some code similar to the example given in Ruby-doc.org’s writeup of group_by:

(0..15).group_by {|number| number % 3}
=> {0=>[0, 3, 6, 9, 12, 15], 1=>[1, 4, 7, 10, 13], 2=>[2, 5, 8, 11, 14]}

In the code above, the numbers 0 through 15 are passed to the block, which receives each number as the parameter number. The group that each number is placed into is determined by the result value of the block, number % 3, whose result can be one of 0, 1 or 2. This means that:

  • The resulting hash will have three groups, represented by the keys 0, 1 and 2
  • The key 0‘s corresponding value is an array containing the numbers in the range (0..15) that are evenly divisible by 3 (i.e. the numbers for which number % 3 is 0.
  • The key 1‘s corresponding value is an array containing the numbers in the range (0..15) that when divided by 3 leave a remainder of 1 (i.e. the numbers for which number % 3 is 1.
  • The key 2‘s corresponding value is an array containing the numbers in the range (0..15) that when divided by 3 leave a remainder of 2 (i.e. the numbers for which number % 3 is 2.

Example 2

In the first example, the keys in the resulting hash are the same type as the values in the array whose contents we’re grouping. In this example, I’ll show that the keys in the resulting hash don’t have to be the same type as the values in the array.

simpsons = %w(Homer Marge Bart Lisa Abraham Herb)
=> ["Homer", "Marge", "Bart", "Lisa", "Abraham", "Herb"]

simpsons.group_by{|simpson| simpson.length}
=> {5=>["Homer", "Marge"], 4=>["Bart", "Lisa", "Herb"], 7=>["Abraham"]}

In the code above, each Simpson name is passed to the block, which receives it as the parameter simpson. The block’s result is the length of simpson, and this result is the group into which the name will go.

In the resulting hash:

  • Note that the keys are integers while the names in the groups are strings.
  • The key 5‘s array contains those names in Simpsons that are 5
    characters in length.
  • The key 4‘s array contains those names in Simpsons that are 4 characters in length.
  • The key 7‘s array contains those names in Simpsons that are 7 characters in length.

Example 3

In the previous two examples, the keys for the resulting array were calculated from the values in the initial array. In this example, I’ll demonstrate that the keys for the groupings can be determined in a completely arbitrary fashion that has nothing to do with the values:

# Put the Simpsons into randomly determined groups
simpsons.group_by{rand(3) + 1}
=> {3=>["Homer", "Bart", "Abraham", "Herb"], 1=>["Marge", "Lisa"]}

# Let's try that again. The results are very likely to be different:
simpsons.group_by{rand(3) + 1}
=> {1=>["Homer", "Bart"], 2=>["Marge", "Lisa", "Herb"], 3=>["Abraham"]}

# One more time!
simpsons.group_by{rand(3) + 1}
=> {2=>["Homer", "Bart", "Lisa"], 3=>["Marge", "Herb"], 1=>["Abraham"]}

Enumerable#group_by and Hashes

When used on a hash, group_by passes each key/value pair in the hash to the block, which you can “catch” as either:

1. A two-element array, with the key as element 0 and its corresponding value as element 1, or
2. Two separate items, with the key as the first item and its corresponding value as the second item.

Example 1

In this example, we’ll group the cast of Family Guy by the item that they’re bringing to a potluck dinner:

potluck = {"Peter" => "lasagna",
           "Lois"  => "potato salad",
           "Chris" => "lasagna",
           "Meg"   => "brownies",
           "Stewie" => "chateaubriand",
           "Brian" => "potato salad",
           "Evil Monkey" => "potato salad"}
=> {"Peter"=>"lasagna", "Lois"=>"potato salad", "Chris"=>"lasagna", "Meg"=>"brownies",
"Stewie"=>"chateaubriand", "Brian"=>"potato salad", "Evil Monkey"=>"potato salad"}

# Here's one way to do it:
potluck.group_by{|person, bringing| bringing}
=> {"lasagna"=>[["Peter", "lasagna"], ["Chris", "lasagna"]], "potato salad"=>[["Lois", "potato salad"],
["Brian", "potato salad"], ["Evil Monkey", "potato salad"]], "brownies"=>[["Meg", "brownies"]],
"chateaubriand"=>[["Stewie", "chateaubriand"]]}

# Here's another way to do it:
potluck.group_by{|person| person[1]}
=> {"lasagna"=>[["Peter", "lasagna"], ["Chris", "lasagna"]], "potato salad"=>[["Lois", "potato salad"],
["Brian", "potato salad"], ["Evil Monkey", "potato salad"]], "brownies"=>[["Meg", "brownies"]],
"chateaubriand"=>[["Stewie", "chateaubriand"]]}

Example 2

In the previous example, the groupings were based on a calculation performed on the objects in the original hash. In this example, the groupings will be random: a random number generator will determine whose car each potluck attendee will ride to the potluck dinner:

potluck.group_by {[:peters_car, :quagmires_car, :clevelands_car][rand(3)]}
=> {:peters_car=>[["Peter", "lasagna"], ["Chris", "lasagna"], ["Evil Monkey", "potato salad"]],
:quagmires_car=>[["Lois", "potato salad"], ["Meg", "brownies"], ["Stewie", "chateaubriand"]],
:clevelands_car=>[["Brian", "potato salad"]]}

# Let's try another random grouping
potluck.group_by {[:peters_car, :quagmires_car, :clevelands_car][rand(3)]}
=> {:peters_car=>[["Peter", "lasagna"], ["Meg", "brownies"]], :quagmires_car=>[["Lois", "potato salad"],
["Stewie", "chateaubriand"], ["Brian", "potato salad"], ["Evil Monkey", "potato salad"]],
:clevelands_car=>[["Chris", "lasagna"]]}

# One more time!
potluck.group_by {[:peters_car, :quagmires_car, :clevelands_car][rand(3)]}
=> {:peters_car=>[["Peter", "lasagna"], ["Chris", "lasagna"], ["Stewie", "chateaubriand"]],
:quagmires_car=>[["Lois", "potato salad"], ["Evil Monkey", "potato salad"]], :clevelands_car=>[["Meg", "brownies"],
["Brian", "potato salad"]]}

Categories
Uncategorized

Enumerating Enumerable: Enumerable#first

Enumerating Enumerable

Welcome to another installment of Enumerating Enumerable, my series of articles in I attempt to do a better job of documenting Ruby’s Enumerable module than Ruby-Doc.org. In this installment, I cover the first method.

In case you missed any of the previous articles, they’re listed and linked below:

  1. all?
  2. any?
  3. collect / map
  4. count
  5. cycle
  6. detect / find
  7. drop
  8. drop_while
  9. each_cons
  10. each_slice
  11. each_with_index
  12. entries / to_a
  13. find_all / select
  14. find_index

Enumerable#first Quick Summary

Graphic representing the "first" method in Ruby's "Enumerable" module

In the simplest possible terms What are the first n items in the collection?
Ruby version 1.8 and 1.9
Expects An optional integer n that specifies the first n items of the collection to return. If this integer is not given, n is 1 by default.
Returns If first is applied to a collection containing m elements:

  • The first item in the collection, if m > 0 and no argument n is provided.
  • An array containing the first n items in the collection, if m > 0 and an argument n is provided.
  • nil if the collection is empty and no argument n is provided.
  • The empty array [] if the collection is empty and an argument n is provided.
RubyDoc.org’s entry Enumerable#first

Enumerable#first and Arrays

When used on an array without an argument, first returns the first item in the array:

posts = ["First post!", "Second post!", "Third post!"]
=> ["First post!", "Second post!", "Third post!"]

# What's the first item in posts?
posts.first
=> "First post!"

# Here's the equivalent using array notation:
posts[0]
=> "First post!"

When used on an array with an integer argument n, first returns an array containing the first n items in the original array:

# What are the first 2 items in posts?
posts.first 2
=> ["First post!", "Second post!"]

# Note that when you provide an argument of 1,
# the result is still an array -- with just one element.
# If you want a scalar, don't use an argument.
posts.first 1
=> ["First post!"]

# Here's the equivalent using array slice notation:
posts[0..1]
=> ["First post!", "Second post!"]

posts[0...2]
=> ["First post!", "Second post!"]

When used on an empty array, first returns:

  • nil if no argument n is provided
  • The empty array, [], if an argument n is provided

[].first
=> nil

[].first 2
=> []

Enumerable#first and Hashes

In Ruby 1.8 and previous versions, hash order is seemingly arbitrary. Starting with Ruby 1.9, hashes retain the order in which they were defined, which makes the first method a little more applicable.

When used on a hash without an argument, first returns the first item in the hash as a two-element array, with the key as the first element and the corresponding value as the second element.

# Let's see what stages of partying our friends are in
party_stages = {"Alice" => :party_mineral,
                "Bob"   => :party_animal,
                "Carol" => :party_reptile,
                "Dave"  => :party_vegetable}
=> {"Alice"=>:party_mineral, "Bob"=>:party_animal, "Carol"=>:party_reptile, "Dave"=>:party_vegetable}

# Who's the first partier and what state is s/he in?
party_stages.first
=> ["Alice", :party_mineral]

When used on a hash with an integer argument n, first returns an array containing the first n items in the hash, with each item represented as a two-element array:

party_stages.first 2
=> [["Alice", :party_mineral], ["Bob", :party_animal]]

# Note that when you provide an argument of 1,
# the result is still an array -- with just one element.
# If you want a scalar, don't use an argument.
party_stages.first 1
=> [["Alice", :party_mineral]]

When used on an empty hash, first returns:

  • nil if no argument n is provided
  • The empty array, [], if an argument n is provided

{}.first
=> nil

{}.first 2
=> []

Categories
Uncategorized

Enumerating Enumerable: Enumerable#find_index

Enumerating Enumerable

Once again, it’s Enumerating Enumerable, my series of articles in which I attempt to outdo Ruby-Doc.org’s documentation of Ruby’s Enumerable module. In this article, I cover the find_index method, which was introduced in Ruby 1.9.

In case you missed any of the previous articles, they’re listed and linked below:

  1. all?
  2. any?
  3. collect / map
  4. count
  5. cycle
  6. detect / find
  7. drop
  8. drop_while
  9. each_cons
  10. each_slice
  11. each_with_index
  12. entries / to_a
  13. find_all / select

Enumerable#find_index Quick Summary

Graphic representation of the "find_index" method in Ruby's "Enumerable" module

In the simplest possible terms What’s the index of the first item in the collection that meets the given criteria?
Ruby version 1.9
Expects A block containing the criteria.
Returns
  • The index of the item in the collection that matches the criteria, if there is one.
  • nil, if no item in the collection matches the crtieria.
RubyDoc.org’s entry Enumerable#find_index

Enumerable#find_index and Arrays

When used on an array, find_index passes each item in the array to the given block and either:

  • Stops when the current item causes the block to return a value that evaluates to true (that is, anything that isn’t false or nil) and returns the index of that item, or
  • Returns nil if there is no item in the array that causes the block to return a value that evaluates to true.

Some examples:

# How about an array of the name of the first cosmonauts and astronauts,
# listed in the chronological order of the missions?
mission_leaders = ["Gagarin", "Shepard", "Grissom", "Titov", "Glenn", "Carpenter",
"Nikolayev", "Popovich"]
=> ["Gagarin", "Shepard", "Grissom", "Titov", "Glenn", "Carpenter", "Nikolayev",
"Popovich"]

# Yuri Gagarin was the first in space
mission_leaders.find_index{|leader| leader == "Gagarin"}
=> 0

# John Glenn was the fifth
mission_leaders.find_index{|leader| leader == "Glenn"}
=> 4

# And James Tiberius Kirk is not listed in the array
kirk_present = mission_leaders.find_index{|leader| leader == "Kirk"}
=> nil

Enumerable#find_index and Hashes

When used on a hash, find_index passes each key/value pair in the hash to the block, which you can “catch” as either:

  1. A two-element array, with the key as element 0 and its corresponding value as element 1, or
  2. Two separate items, with the key as the first item and its corresponding value as the second item.

As with arrays, find_index:

  • Stops when the current item causes the block to return a value that evaluates to true (that is, anything that isn’t false or nil) and returns the index of that item, or
  • Returns nil if there is no item in the array that causes the block to return a value that evaluates to true.

Some examples:

require 'date'
=> true

# These are the names of the first manned spaceships and their launch dates
launch_dates = {"Kedr"              => Date.new(1961, 4, 12),
                "Freedom 7"         => Date.new(1961, 5, 5),
                "Liberty Bell 7"    => Date.new(1961, 7, 21),
                "Orel"              => Date.new(1961, 8, 6),
                "Friendship 7"      => Date.new(1962, 2, 20),
                "Aurora 7"          => Date.new(1962, 5, 24),
                "Sokol"             => Date.new(1962, 8, 11),
                "Berkut"            => Date.new(1962, 8, 12)}
=> {"Kedr"=>#<Date: 4874803/2,0,2299161>, "Freedom 7"=>#<Date: 4874849/2,0,2299161>,
"Liberty Bell 7"=>#<Date: 4875003/2,0,2299161>, "Orel"=>#<Date: 4875035/2,0,2299161>,
"Friendship 7"=>#<Date: 4875431/2,0,2299161>, "Aurora 7"=>#<Date: 4875617/2,0,2299161>,
"Sokol"=>#<Date: 4875775/2,0,2299161>, "Berkut"=>#<Date: 4875777/2,0,2299161>}

# Where in the list is John Glenn's ship, the Friendship 7?
launch_dates.find_index{|ship, date| ship == "Friendship 7"}
=> 4

# Where in the list is the first mission launched in August 1962?
launch_dates.find_index{|ship, date| date.year == 1962 && date.month == 8}
=> 6

# The same thing, expressed a little differently
launch_dates.find_index{|launch| launch[1].year == 1962 && launch[1].month == 8}
=> 6

Using find_index as a Membership Test

Although Enumerable has a method for checking whether an item is a member of a collection (the include? method and its synonym, member?), find_index is a more powerful membership test for two reasons:

  1. include?/member? only check membership by using the == operator, while find_index lets you define a block to set up all sorts of tests. include?/member? asks “Is there an object X in the collection equal to my object Y?” while find_index can be used to ask “Is there an object X in the collection that matches these criteria?”
  2. include?/member? returns true if there is an object X in the collection that is equal to the given object Y. find_index goes one step further: not only can it be used to report the equivalent of true if there is an object X in the collection that is equal to the given object Y, it also reports its location in the collection.

A quick example of this use in action:

# Once again, the mission leaders
mission_leaders = ["Gagarin", "Shepard", "Grissom", "Titov", "Glenn", "Carpenter",
"Nikolayev", "Popovich"]
=> ["Gagarin", "Shepard", "Grissom", "Titov", "Glenn", "Carpenter", "Nikolayev",
 "Popovich"]

# Yuri Gagarin is in the list
gagarin_in_list = mission_leaders.find_index {|leader| leader == "Gagarin"}
=> 0

# Captain James T. Kirk is not
kirk_in_list = mission_leaders.find_index {|leader| leader == "Kirk"}
=> nil

# gagarin_in_list is 0, which as a non-false and non-nil value evaluates as true.
# We can use it as both a membership test *and* as his location in the list.
p "Gagarin's there. He's number #{gagarin_in_list + 1} in the list." if gagarin_in_list
"Gagarin's there. He's number 1 in the list."
=> "Gagarin's there. He's number 1 in the list."

# kirk_in_list is nil, which is one of Ruby's two "false" values.
# Let's use it with the "something OR something else" idiom that
# many Ruby programmers like.
kirk_in_list || (p "You only *think* he wasn't there.")
"You only *think* he wasn't there."
=> "You only *think* he wasn't there."

Parts that Haven’t Been Implemented Yet

Ruby-Doc.org’s documentation is generated from the comments in the C implementation of Ruby. It mentions a way of calling find_index that is just like calling include?/member?:

# What the docs say (does not work yet!)
["Alice", "Bob", "Carol"].find_index("Bob")
=> 1

# What happens with the current version of Ruby 1.9
["Alice", "Bob", "Carol"].find_index("Bob")
ArgumentError: wrong number of arguments(1 for 0)
...


Ruby 1.9 is considered to be a work in progress, so I suppose it’ll get implemented in a later release.

Categories
Uncategorized

Enumerating Enumerable: Enumerable#drop_while

After the wackiness of the past couple of weeks — some travel to see family, followed by a busy week of tech events including DemoCamp 18, Damian Conway’s presentation, FAILCamp and RubyFringe — I’m happy to return to Enumerating Enumerable, the article series in which I attempt to do a better job at documenting Ruby’s Enumerable module than Ruby-Doc.org does.

In this article, the eighth in the series, I’m going to cover a method introduced in Ruby 1.9: drop_while.

I’m going through the Enumerable‘s methods in alphabetical order. If you missed any of the earlier articles, I’ve listed them all below:

  1. all?
  2. any?
  3. collect / map
  4. count
  5. cycle
  6. detect / find
  7. drop

Enumerable#drop_while Quick Summary

Graphic representation of the \"drop_while\" method in Ruby\'s \"Enumerable\" module

In the simplest possible terms Given a collection and a condition, return an array made of the collection’s items, starting the first item that doesn’t meet the condition.
Ruby version 1.9 only
Expects A block containing the condition.
Returns An array made up of the remaining items, if there are any.
RubyDoc.org’s entry Enumerable#drop_while

Enumerable#drop_while and Arrays

When used on an array, drop_while returns a copy of the array created by going through the original array’s items in order, dropping elements until it encounters the an element that does not meet the condition. The resulting array is basically a copy of the original array, starting at the first element that doesn’t meet the condition in the block.

As in many cases, things become clearer with some examples:

# In Canada, and in fact in all but 2 countries in the world,
# the weather report gives temperatures in Celsius!
temperatures = [28, 25, 30, 22, 27]
=> [28, 25, 30, 22, 27]

# The block returns true for the FIRST two elements,
# and false for the third.
# So drop_while returns an array like the original,
# but starting at the third element.
temperatures.drop_while {|temperature| temperature < 30}
=> [30, 22, 27]

# The block returns false for the first element,
# so drop_while returns an array like the original,
# starting at the first element
# (in other words, a copy of the original).
temperatures.drop_while {|temperature| temperature < 28}
=> [28, 25, 30, 22, 27]

Enumerable#drop_while and Hashes

When used on a hash, drop_while effectively:

  • Creates an array based on the hash, with each element in the hash represented as a two-element array where the first element contains the key and the second element containing the corresponding value, then
  • goes through each element in the array, dropping elements until it encounters the first element that doesn’t meet the condition in the block. The resulting array is an array of two-element arrays, starting at the first element that doesn’t meet the condition in the block.

Once again, examples will make this all clear:

# We're basically taking the array from the previous example
# and dressing it up in a hash
city_temperatures = {"New York" => 28, \
                     "Toronto" => 25, \
                     "Washington" => 30, \
                     "Montreal" => 22, \
                     "Boston" => 27}
=> {"New York"=>28, "Toronto"=>25, "Washington"=>30,
"Montreal"=>22, "Boston"=>27}

# The block returns true for the FIRST two elements,
# and false for the third.
# So drop_while returns an array based on the hash,
# but starting at the third element
# (and, of course, the key-value pairs turned into
# two-element arrays).
city_temperatures.drop_while {|city_temperature| city_temperature[1] < 30}
=> [["Washington", 30], ["Montreal", 22], ["Boston", 27]]

# This is a more readable version of the line above
city_temperatures.drop_while {|city, temperature| temperature < 30}
=> [["Washington", 30], ["Montreal", 22], ["Boston", 27]]

# The block returns false for the first element,
# so drop_while returns an array based on the hash,
# starting at the first element
# (in other words, an array based on the original hash,
# with key-value pairs turned into two-element arrays).
city_temperatures.drop_while {|city,temperature| temperature < 28}
=> [["New York", 28], ["Toronto", 25], ["Washington", 30],
["Montreal", 22], ["Boston", 27]]

Enumerable#drop_while’s Evil Twin, Enumerable#take_while

I’ll cover take_while in detail in a later installment, but for now, an example should suffice:

city_temperatures.drop_while {|city_temperature| city_temperature[1] < 30}
=> [["Washington", 30], ["Montreal", 22], ["Boston", 27]]

city_temperatures.take_while {|city_temperature| city_temperature[1] < 30}
=> [["New York", 28], ["Toronto", 25]]

Categories
Uncategorized

Enumerating Enumerable: Enumerable#detect/Enumerable#find

Here’s another article in the Enumerating Enumerable series, in which I attempt to improve upon RubyDoc.org’s documentation for the Enumerable module, which I find rather lacking. If you’ve missed the previous articles in the series, I’ve listed them below:

  1. all?
  2. any?
  3. collect / map
  4. count
  5. cycle

This installment covers a method that goes by two names: detect or find. I personally prefer find, as it’s shorter and the term I tend to use for its function.

Enumerable#detect/Enumerable#find Quick Summary

Graphic representation of the \"detect\" or \"find\" method in Ruby\'s \"Enumerable\" module.

In the simplest possible terms What’s the first item in the collection that meets the given criteria?
Ruby version 1.8 and 1.9
Expects
  • A block containing the criteria.
  • An optional argument containing a proc that calculates a “default” value — that is, the value to return if no item in the collection matches the criteria.
Returns The first item in the collection that matches the criteria, if one exists.
If no such item exists in the collection, detect/find returns:

  • nil is returned if no argument is provided
  • the value of the argument, if one is provided.
RubyDoc.org’s entry Enumerable#detect / Enumerable#find

Enumerable#detect/Enumerable#find and Arrays

When used on an array without an argument, detect/find passes each item from the collection to the block and…

  • If the current item causes the block to return a value that doesn’t evaluate to false, detect/find stops going through collection and returns the item.
  • If no item in the collection causes the block to return a value that doesn’t evaluate to false, detect/find returns nil.

In the examples that follow, I’ll be using the find method. detect does exactly the same thing; it’s just that I prefer find.

# Time to establish my "old fart" credentials
classic_rock_bands = ["AC/DC", "Black Sabbath", "Queen", \
"Ted Nugent and the Amboy Dukes", "Scorpions", "Van Halen"]
=> ["AC/DC", "Black Sabbath", "Queen", "Ted Nugent and the Amboy Dukes",
"Scorpions", "Van Halen"]

# Of the bands in the array, which is the first one
# whose name is longer than 8 characters?
classic_rock_bands.find {|band| band.length > 8}
=> "Black Sabbath"

# Which is the first one whose name is exactly
# 5 characters long?
classic_rock_bands.find {|band| band.length == 5}
=> "AC/DC"

# The order of items in the array can affect "find"'s result.
# Let's put the array into reverse sorted order:
classic_rock_bands.sort!.reverse!
=> ["Van Halen", "Ted Nugent and the Amboy Dukes", "Scorpions", \
"Queen", "Black Sabbath", "AC/DC"]

# Again: which is the first band whose name
# is longer than 8 characters?
=> "Van Halen"

# Again: which is the first band whose name
# is exactly 5 characters?
=> "Queen"

# If no band in the array meets the criteria,
# "find" returns nil.
# There are no bands in the list with names shorter
# than 5 characters...
classic_rock_bands.find {|band| band.length < 5}
=> nil

Using the optional argument is a topic big enough to merit its own section, which appears later in this article.

Enumerable#detect/Enumerable#find and Hashes

When used on a hash and a block is provided, detect/find passes each key/value pair in the hash to the block, which you can “catch” as either:

  1. A two-element array, with the key as element 0 and its corresponding value as element 1, or
  2. Two separate items, with the key as the first item and its corresponding value as the second item.

When used on a hash without an argument, detect/find passes each item from the collection to the block and…

  • If the current item causes the block to return a value that doesn’t evaluate to false, detect/find stops going through collection and returns the item.
  • If no item in the collection causes the block to return a value that doesn’t evaluate to false, detect/find returns nil.

years_founded = {"AC/DC" => 1973, \
                 "Black Sabbath" => 1968, \
                 "Queen" => 1970, \
                 "Ted Nugent and the Amboy Dukes" => 1967, \
                 "Scorpions" => 1965, \
                 "Van Halen" => 1972}
# Ruby 1.8 re-orders hashes in some mysterious way that is almost never
# the way you entered it. Here's what I got in Ruby 1.8:
=> {"Queen"=>1970, "AC/DC"=>1973, "Black Sabbath"=>1968, "Scorpions"=>1965,
"Ted Nugent and the Amboy Dukes"=>1967, "Van Halen"=>1972}

# Ruby 1.9 preserves hash order so that hashes keep the order in which
# you defined them. Here's what I got in Ruby 1.9:
=> {"AC/DC"=>1973, "Black Sabbath"=>1968, "Queen"=>1970,
"Ted Nugent and the Amboy Dukes"=>1967, "Scorpions"=>1965, "Van Halen"=>1972}

# Which band is the first in the hash to be founded
# in 1970 or later?
years_founded.find {|band| band[1] >= 1970}
# In Ruby 1.8, the result is:
=> ["Queen", 1970]
# In Ruby 1.9, the result is:
=> ["AC/DC", 1973]

# Here's another way of phrasing it:
years_founded.find {|band, year_founded| year_founded >= 1970}

Using Enumerable#detect/Enumerable#find with the Optional Argument

detect/find‘s optional argument lets you specify a proc or lambda whose return value will be the result in cases where no object in the collection matches the criteria.

(Unfortunately, a complete discussion of procs and lambdas is beyond the scope of this article. I highly recommend looking at Eli Bendersky’s very informative article, Understanding Ruby blocks, Procs and methods.)

I think that the optional argument is best explained through examples…

# Once again, the array of classic rock bands
classic_rock_bands = ["AC/DC", "Black Sabbath", "Queen", \
"Ted Nugent and the Amboy Dukes", "Scorpions", "Van Halen"]

# Let's define a proc that will simply returns the band name
# "ABBA", who are most definitely not considered to be
# a classic rock band.
default_band = Proc.new {"ABBA"}

# Procs are objects, so using a proc's name alone isn't sufficient
to invoke its code -- doing so will simply return the proc object.
default_band
=> #<Proc:0x00553f34@(irb):31>
# (The actual value will be different for you, but you get the idea.)

# To call a proc, you have to use its "call" method:
default_band.call
=> "ABBA"

# What we want to do is use "default_band" as a proc that
# provides a default value in the event that detect/find doesn't
# find a value that matches the critera.

# detect/find calls the "call" method of the object you provide
# as the argument if no item in the collection matches the
# criteria in the block.

# There *is* a band in the array that comes after
# "Led Zeppelin" alphabetically: Queen.
classic_rock_bands.find(default_band) {|band| band > "Led Zeppelin"}
=> "Queen"

# The last band in the array, alphabetically speaking,
# is Van Halen. So if we ask detect/find to find a band that
# comes after Van Halen, it won't find one.
# Without the argument, detect/find returns nil,
# but *with* the argument, it will invoke the "call"
# method of the object you provide it.
classic_rock_bands.find(default_band) {|band| band > "Van Halen"}
=> "ABBA"

# Let's try something a little fancier. This time, we'll use a lambda.
# The differences between procs and lambdas are very fine -- I suggest
# you check Eli Bendersky's article for those differences.

# Let's create a lambda that when called,
# returns the name of a randomly-selected pop singer.
random_band = lambda do
    fallback_bands = ["Britney Spears", "Christina Aguilera", "Ashlee Simpson"]
    fallback_bands[rand(fallback_bands.size)]
end

# Let's give it a try...
classic_rock_bands.find(random_band) {|band| band > "Van Halen"}
=> "Britney Spears"

classic_rock_bands.find(random_band) {|band| band > "Van Halen"}
=> "Ashlee Simpson"

classic_rock_bands.find(random_band) {|band| band > "Van Halen"}
=> "Christina Aguilera"

classic_rock_bands.find(random_band) {|band| band > "Led Zeppelin"}
=> "Queen"

To see a “real” application of detect/find's optional argument, see this Ruby Quiz problem.

Categories
Uncategorized

Enumerating Enumerable: Enumerable#count

Welcome to the fourth installment of Enumerating Enumerable, a series of articles in which I challenge myself to do a better job of documenting Ruby’s Enumerable module than RubyDoc.org does. In this article, I’ll cover Enumerable#count, one of the new methods added to Enumerable in Ruby 1.9.

In case you missed the earlier installments, they’re listed (and linked) below:

  1. all?
  2. any?
  3. collect / map

Enumerable#count Quick Summary

Graphic representation of the Enumberable#count method in Ruby

In the simplest possible terms How many items in the collection meet the given criteria?
Ruby version 1.9 only
Expects Either:

  • An argument to be matched against the items in the collection
  • A block containing an expression to test the items in the collection
Returns The number of items in the collection that meet the given criteria.
RubyDoc.org’s entry Enumerable#count

Enumerable#count and Arrays

When used on an array and an argument is provided, count returns the number of times the value of the argument appears in the array:

# How many instances of "zoom" are there in the array?
["zoom", "schwartz", "profigliano", "zoom"].count("zoom")
=> 2

# Prior to Ruby 1.9, you'd have to use this equivalent code:
["zoom", "schwartz", "profigliano", "zoom"].select {|word| word == "zoom"}.size
=> 2

When used on an array and a block is provided, count returns the number of items in the array for which the block returns true:

# How many members of "The Mosquitoes" (a Beatles-esque band that appeared on
# "Gilligan's Island") have names following the "B*ngo" format?
["Bingo", "Bango", "Bongo", "Irving"].count {|bandmate| bandmate =~ /B[a-z]ngo/}
=> 3

# Prior to Ruby 1.9, you'd have to use this equivalent code:
["Bingo", "Bango", "Bongo", "Irving"].select {|bandmate| bandmate =~ /B[a-z]ngo/}.size

RubyDoc.org says that when count is used on an array without an argument or a block, it simply returns the number of items in the array (which is what the length/size methods do). However, when I’ve tried it in irb and ruby, I got results like this:

[1, 2, 3, 4].count
=> #<Enumerable::Enumerator:0x189d784>

Enumerable#count and Hashes

As with arrays, when used on a hash and an argument is provided, count returns the number of times the value of the argument appears in the hash. The difference is that for the comparison, each key/value pair is treated as a two-element array, with the key being element 0 and the value being element 1.

# Here's a hash where the names of recent movies are keys
# and their metacritic.com ratings are the corresponding values.
movie_ratings = {"Get Smart" => 53, "Kung Fu Panda" => 88, "The Love Guru" => 15,
"Sex and the City" => 51, "Iron Man" => 93}
=> {"Get Smart"=>53, "Kung Fu Panda"=>88, "The Love Guru"=>15, "Sex and the City"=>51, "Iron Man"=>93}

# This statement will return a count of 0, since there is no item in movie_ratings
# that's just plain "Iron Man".
movie_ratings.count("Iron Man")
=> 0

# This statement will return a count of 1, since there is an item in movie_ratings
# with the key "Iron Man" and the corresponding value 93.
movie_ratings.count(["Iron Man", 93])
=> 1

# This statement will return a count of 0. There's an item in movie_ratings
# with the key "Iron Man", but its corresponding value is NOT 92.
movie_ratings.count(["Iron Man", 92])
=> 0

count is not useful when used with a hash and an argument. It will only ever return two values:

  • 1 if the argument is a two-element array and there is an item in the hash whose key matches element [0] of the array and whose value matches element [1] of the array.
  • 0 for all other cases.

When used with a hash and a block, count is more useful. count passes each key/value pair in the hash to the block, which you can “catch” as either:

  1. A two-element array, with the key as element 0 and its corresponding value as element 1, or
  2. Two separate items, with the key as the first item and its corresponding value as the second item.

Each key/value pair is passed to the block and count returns the number of items in the hash for which the block returns true.

# Once again, a hash where the names of recent movies are keys
# and their metacritic.com ratings are the corresponding values.
movie_ratings = {"Get Smart" => 53, "Kung Fu Panda" => 88, "The Love Guru" => 15,
 "Sex and the City" => 51, "Iron Man" => 93}
=> {"Get Smart"=>53, "Kung Fu Panda"=>88, "The Love Guru"=>15, "Sex and the City"=>51, "Iron Man"=>93}

# How many movie titles in the collection start
# in the first half of the alphabet?
# (Using a one-parameter block)
movie_ratings.count {|movie| movie[0] <= "M"}
=> 3

# How many movie titles in the collection start
# in the first half of the alphabet?
# (This time using a two-parameter block)
movie_ratings.count {|title, rating| title <= "M"}
=> 3

# Here's how you'd do it in pre-1.9 Ruby:
movie_ratings.select {|movie| movie[0] <= "M"}.size
=> 3
# or...
movie_ratings.select {|title, rating| title <= "M"}.size
=> 3

# How many movies in the collection had a rating
# higher than 80?
# (Using a one-parameter block)
movie_ratings.count {|movie| movie[1] > 80}
=> 2

# How many movies in the collection had a rating
# higher than 80?
# (This time using a two-parameter block)
movie_ratings.count {|title, rating| rating > 80}
=> 2

# Here's how you'd do it in pre-1.9 Ruby:
movie_ratings.select {|title, rating| rating > 80}.size
=> 2


# How many movies in the collection have both:
# - A title starting in the second half of the alphabet?
# - A rating less than 50?
# (Using a one-parameter block)
movie_ratings.count {|movie| movie[0] >= "M" && movie[1] < 50}
=> 1

# How many movies in the collection have both:
# - A title starting in the second half of the alphabet?
# - A rating less than 50?
# (This time using a two-parameter block)
movie_ratings.count {|title, rating| title >= "M" && rating < 50}
=> 1

# Here's how you'd do it in pre-1.9 Ruby:
movie_ratings.select {|title, rating| title >= "M" && rating < 50}.size
=> 1

(You should probably skip The Love Guru completely, or at least until it gets aired on TV for free.)