Enumerating Enumerable: Enumerable#group_by

Once again, it’s Enumerating Enumerable time! This is the latest in my series of articles where I set out to make better documentation for Ruby’s Enumerable module than Ruby-Doc.org’s. In this installment — the seventeenth in the series — I cover the group_by method.

In case you missed any of the previous articles, they’re listed and linked below:

  1. all?
  2. any?
  3. collect / map
  4. count
  5. cycle
  6. detect / find
  7. drop
  8. drop_while
  9. each_cons
  10. each_slice
  11. each_with_index
  12. entries / to_a
  13. find_all / select
  14. find_index
  15. first
  16. grep

Enumerable#group_by Quick Summary

Graphic representation of the "group_by" method in Ruby's "Enumerable" module.

In the simplest possible terms: Break a collection into groups based on some given criteria.
Ruby version: 1.9 only
Expects: A block containing the criteria by which the items in the collection will be grouped.
Returns: A hash where each key represents a group. Each key’s corresponding value is an array containing the members of that group.
Ruby-Doc.org’s entry: Enumerable#group_by

Enumerable#group_by and Arrays

When used on an array, group_by iterates through the array, passing each element to the block. The result value of the block is the group into which the element will be placed.
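
To make the mechanics concrete, here is a minimal sketch of the same behaviour written by hand. The method name my_group_by and the words array are made up for illustration; this is not how Ruby implements group_by internally, just a plain-Ruby equivalent:

```ruby
# Hypothetical helper, for illustration only: a hand-rolled equivalent
# of Enumerable#group_by built on each.
def my_group_by(collection)
  groups = {}
  collection.each do |element|
    key = yield(element)              # the block's result names the group
    (groups[key] ||= []) << element   # create the group on first use, then append
  end
  groups
end

words = %w(apple avocado banana blueberry cherry)
by_initial = my_group_by(words) { |word| word[0] }
# by_initial is the same hash that words.group_by { |word| word[0] } returns
```

Each distinct block result becomes a key the first time it is seen, and every element is appended to the array under its key.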

Example 1

For the first example, I’ll use some code similar to the example given in Ruby-doc.org’s writeup of group_by:

(0..15).group_by {|number| number % 3}
=> {0=>[0, 3, 6, 9, 12, 15], 1=>[1, 4, 7, 10, 13], 2=>[2, 5, 8, 11, 14]}

In the code above, the numbers 0 through 15 are passed to the block, which receives each number as the parameter number. The group that each number is placed into is determined by the result value of the block, number % 3, whose result can be one of 0, 1 or 2. This means that:

  • The resulting hash will have three groups, represented by the keys 0, 1 and 2
  • The key 0’s corresponding value is an array containing the numbers in the range (0..15) that are evenly divisible by 3 (i.e. the numbers for which number % 3 is 0).
  • The key 1’s corresponding value is an array containing the numbers in the range (0..15) that when divided by 3 leave a remainder of 1 (i.e. the numbers for which number % 3 is 1).
  • The key 2’s corresponding value is an array containing the numbers in the range (0..15) that when divided by 3 leave a remainder of 2 (i.e. the numbers for which number % 3 is 2).

Example 2

In the first example, the keys in the resulting hash are the same type as the values in the collection we’re grouping. In this example, I’ll show that the keys in the resulting hash don’t have to be the same type as the values being grouped.

simpsons = %w(Homer Marge Bart Lisa Abraham Herb)
=> ["Homer", "Marge", "Bart", "Lisa", "Abraham", "Herb"]

simpsons.group_by{|simpson| simpson.length}
=> {5=>["Homer", "Marge"], 4=>["Bart", "Lisa", "Herb"], 7=>["Abraham"]}

In the code above, each Simpson name is passed to the block, which receives it as the parameter simpson. The block’s result is the length of simpson, and this result is the group into which the name will go.

In the resulting hash:

  • Note that the keys are integers while the names in the groups are strings.
  • The key 5’s array contains those names in simpsons that are 5 characters in length.
  • The key 4’s array contains those names in simpsons that are 4 characters in length.
  • The key 7’s array contains those names in simpsons that are 7 characters in length.
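
Because the result is an ordinary hash, the usual hash operations work on it. The sketch below repeats the simpsons array so it runs on its own; the variable name by_length is my own choice:

```ruby
simpsons = %w(Homer Marge Bart Lisa Abraham Herb)
by_length = simpsons.group_by { |simpson| simpson.length }

by_length.keys   # => [5, 4, 7]
by_length[4]     # => ["Bart", "Lisa", "Herb"]
by_length[6]     # => nil, since no name is 6 characters long
```

Asking for a group that was never created simply returns nil, as with any hash that has no default value.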

Example 3

In the previous two examples, the keys for the resulting hash were calculated from the values in the initial collection. In this example, I’ll demonstrate that the keys for the groupings can be determined in a completely arbitrary fashion that has nothing to do with the values:

# Put the Simpsons into randomly determined groups
simpsons.group_by{rand(3) + 1}
=> {3=>["Homer", "Bart", "Abraham", "Herb"], 1=>["Marge", "Lisa"]}

# Let's try that again. The results are very likely to be different:
simpsons.group_by{rand(3) + 1}
=> {1=>["Homer", "Bart"], 2=>["Marge", "Lisa", "Herb"], 3=>["Abraham"]}

# One more time!
simpsons.group_by{rand(3) + 1}
=> {2=>["Homer", "Bart", "Lisa"], 3=>["Marge", "Herb"], 1=>["Abraham"]}
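
Whatever the random numbers turn out to be, two things always hold: the only possible keys are 1, 2 and 3, and every name lands in exactly one group. Here is a quick check of both, with a variable name (random_groups) of my own choosing:

```ruby
simpsons = %w(Homer Marge Bart Lisa Abraham Herb)
random_groups = simpsons.group_by { rand(3) + 1 }

# Every key is one of the three possible block results.
random_groups.keys.all? { |key| [1, 2, 3].include?(key) }   # => true

# No name is lost or duplicated, whichever groups the names land in.
random_groups.values.flatten.sort == simpsons.sort          # => true
```

Note that a key only appears in the hash if at least one element was assigned to it, which is why the first run above has no group 2.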

Enumerable#group_by and Hashes

When used on a hash, group_by passes each key/value pair in the hash to the block, which you can “catch” as either:

1. A two-element array, with the key as element 0 and its corresponding value as element 1, or
2. Two separate items, with the key as the first item and its corresponding value as the second item.

Example 1

In this example, we’ll group the cast of Family Guy by the item that they’re bringing to a potluck dinner:

potluck = {"Peter" => "lasagna",
           "Lois"  => "potato salad",
           "Chris" => "lasagna",
           "Meg"   => "brownies",
           "Stewie" => "chateaubriand",
           "Brian" => "potato salad",
           "Evil Monkey" => "potato salad"}
=> {"Peter"=>"lasagna", "Lois"=>"potato salad", "Chris"=>"lasagna", "Meg"=>"brownies",
"Stewie"=>"chateaubriand", "Brian"=>"potato salad", "Evil Monkey"=>"potato salad"}

# Here's one way to do it:
potluck.group_by{|person, bringing| bringing}
=> {"lasagna"=>[["Peter", "lasagna"], ["Chris", "lasagna"]], "potato salad"=>[["Lois", "potato salad"],
["Brian", "potato salad"], ["Evil Monkey", "potato salad"]], "brownies"=>[["Meg", "brownies"]],
"chateaubriand"=>[["Stewie", "chateaubriand"]]}

# Here's another way to do it:
potluck.group_by{|person| person[1]}
=> {"lasagna"=>[["Peter", "lasagna"], ["Chris", "lasagna"]], "potato salad"=>[["Lois", "potato salad"],
["Brian", "potato salad"], ["Evil Monkey", "potato salad"]], "brownies"=>[["Meg", "brownies"]],
"chateaubriand"=>[["Stewie", "chateaubriand"]]}

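Notice that each group holds whole [key, value] pairs rather than just the names. If you only want the names, one option is to reduce each group to the first element of its pairs. This is a sketch of my own rather than part of the example above; it uses a trimmed-down copy of the potluck hash and a made-up variable name, names_by_dish:

```ruby
potluck = {"Peter" => "lasagna",
           "Lois"  => "potato salad",
           "Chris" => "lasagna",
           "Meg"   => "brownies"}

names_by_dish = {}
potluck.group_by { |person, bringing| bringing }.each do |dish, pairs|
  # Each pair is [person, dish]; keep only the person.
  names_by_dish[dish] = pairs.map { |pair| pair[0] }
end

names_by_dish
# => {"lasagna"=>["Peter", "Chris"], "potato salad"=>["Lois"], "brownies"=>["Meg"]}
```
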
Example 2

In the previous example, the groupings were based on a calculation performed on the objects in the original hash. In this example, the groupings will be random: a random number generator will determine whose car each potluck attendee rides in to the potluck dinner:

potluck.group_by {[:peters_car, :quagmires_car, :clevelands_car][rand(3)]}
=> {:peters_car=>[["Peter", "lasagna"], ["Chris", "lasagna"], ["Evil Monkey", "potato salad"]],
:quagmires_car=>[["Lois", "potato salad"], ["Meg", "brownies"], ["Stewie", "chateaubriand"]],
:clevelands_car=>[["Brian", "potato salad"]]}

# Let's try another random grouping
potluck.group_by {[:peters_car, :quagmires_car, :clevelands_car][rand(3)]}
=> {:peters_car=>[["Peter", "lasagna"], ["Meg", "brownies"]], :quagmires_car=>[["Lois", "potato salad"],
["Stewie", "chateaubriand"], ["Brian", "potato salad"], ["Evil Monkey", "potato salad"]],
:clevelands_car=>[["Chris", "lasagna"]]}

# One more time!
potluck.group_by {[:peters_car, :quagmires_car, :clevelands_car][rand(3)]}
=> {:peters_car=>[["Peter", "lasagna"], ["Chris", "lasagna"], ["Stewie", "chateaubriand"]],
:quagmires_car=>[["Lois", "potato salad"], ["Evil Monkey", "potato salad"]], :clevelands_car=>[["Meg", "brownies"],
["Brian", "potato salad"]]}
