
You want sum of this? (or: What does the Σ symbol mean?)

If you’ve been perusing LinkedIn or a programming site like Lobste.rs, you may have seen that the professors who teach Stanford’s machine learning course, CS229, have posted their lecture notes online, a whopping 226 pages of them! This is pure gold for anyone who wants to get up to speed on machine learning but doesn’t have the time (or $55K a year) to spend on getting a bachelor’s degree in computer science from “The Cardinal.”

Or at least, it seems like pure gold…until you start reading it. Here’s page 1 of Chapter 1:

This is the sort of material that sends people running away screaming. For many, the first reaction upon being confronted with it would be something like “What is this ℝ thing in the second paragraph? What’s with the formulas on the first page? What the hell is that Σ thing? This is programming…nobody told me there would be math!”

If you’re planning to really get into AI programming and take great pains to avoid mathematics, I have good news and bad news for you.

First, the bad news: A lot of AI involves “college-level” math. There’s linear algebra, continuous functions, statistics, and a dash of calculus. It can’t be helped — machine learning and data science are at the root of the way artificial intelligence is currently being implemented, and both involve number-crunching.

And now, the good news: I’m here to help! I’m decent at both math and explaining things.

Over the next little while, I’m going to post articles in a series called Math WTF that will explain the math that you might encounter while learning AI and doing programming. I’m going to keep it as layperson-friendly as possible, and in the end, you’ll find yourself understanding stuff like the page I posted above.

So welcome to the first article in the Math WTF series, where I’ll explain something you’re likely to run into when reading notes or papers on AI and data science: the Σ symbol.

Σ, or sigma

As explained in the infographic above, the letter Σ — called “sigma” — is the Greek equivalent of our letter S. It means “the sum of a series.”

The series in question is determined by the things above, below, and to the right of the Σ:

  • The thing to the right of the Σ describes each term in the series: 2n + 3, or as we’d say in code, 2 * n + 3.
  • The thing below the Σ specifies the index variable — the variable we’ll use for counting terms in the series (which in this case is n) — and its initial value (which in this case is 1).
  • The thing above the Σ specifies the final value of the index variable, which in this case is 4.

So you can read the equation pictured above as “The sum of all the values of 2n + 3, starting at n = 1 and ending with n = 4.”

If you write out this sum one term at a time, starting with n = 1 and ending with n = 4, you get this…

((2 * 1) + 3) + ((2 * 2) + 3) + ((2 * 3) + 3) + ((2 * 4) + 3)

…and the answer is 32.
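
For reference, here is the same calculation in conventional notation. This is a sketch that assumes the pictured equation matches the description above: the term is 2n + 3 and the index n runs from 1 to 4.

```latex
\sum_{n=1}^{4} (2n + 3)
  = (2 \cdot 1 + 3) + (2 \cdot 2 + 3) + (2 \cdot 3 + 3) + (2 \cdot 4 + 3)
  = 5 + 7 + 9 + 11
  = 32
```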

You could express this calculation in Python this way…

# Python 3.11

total = 0
for n in range(1, 5):    # n takes the values 1, 2, 3, 4
    total += 2 * n + 3   # add the current term to the running total

print(total)  # 32

Keep in mind that range(1, 5) means “a range of integers starting at 1 and going up to, but not including, 5.” In other words, it means “1, 2, 3, 4.”
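
Because range() leaves out its upper bound, a reusable version of the loop has to add 1 to the final value written above the Σ. Here is a minimal sketch of that idea. The function name sigma and its parameters are made up for illustration and don't come from any library.

```python
# Python 3.11
from typing import Callable


def sigma(term: Callable[[int], int], start: int, end: int) -> int:
    """Add up term(n) for every integer n from start to end, inclusive."""
    total = 0
    # range() stops before its second argument, so pass end + 1
    # to include the final value of the index variable.
    for n in range(start, end + 1):
        total += term(n)
    return total


# Quick check of what range(1, 5) actually contains.
print(list(range(1, 5)))  # [1, 2, 3, 4]

# Right of the sigma: 2n + 3. Below it: n = 1. Above it: 4.
result = sigma(lambda n: 2 * n + 3, 1, 4)
print(result)  # 32
```

The three arguments line up with the three things around the Σ: the term, the initial value of the index variable, and its final value.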

There’s a more Pythonic way to do it:

# Python 3.11

sum([2 * n + 3 for n in range(1, 5)])

This works well when the set of terms is small. Here, we’re looking at a sum of 4 terms, so generating a list and then using the sum function on it costs next to nothing. But if we were dealing with a large set of terms (say tens of thousands, hundreds of thousands, or more), we might want to go with a generator instead:

# Python 3.11

sum((2 * n + 3 for n in range(1, 5)))

The difference is the brackets:

  • [2 * n + 3 for n in range(1, 5)] has square brackets on the outside. This creates a list of 4 items. Creating 4 items doesn’t take up much processing time or memory, but creating hundreds of thousands could.
  • (2 * n + 3 for n in range(1, 5)) has round brackets on the outside. This creates a generator, which doesn’t build the whole sequence up front. Instead, it produces the next item each time sum asks for one, so it takes up very little memory, even when going through a sequence of millions, billions, or even trillions of terms. Because the generator is the only argument passed to sum, the extra pair of round brackets is optional: sum(2 * n + 3 for n in range(1, 5)) works too.
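
To make the memory difference concrete, here is a rough sketch that compares the two forms with sys.getsizeof from the standard library. The count of 100,000 terms is an arbitrary choice for illustration, and the exact byte counts will vary between Python versions and platforms.

```python
# Python 3.11
import sys

terms = 100_000  # arbitrary size, chosen only for this demonstration

as_list = [2 * n + 3 for n in range(1, terms + 1)]
as_generator = (2 * n + 3 for n in range(1, terms + 1))

# getsizeof reports the size of the container object itself.
# For the list, that is the array of references to its items
# (the integer objects themselves are not counted).
# For the generator, it is just a small, fixed-size object.
list_bytes = sys.getsizeof(as_list)
generator_bytes = sys.getsizeof(as_generator)
print(list_bytes, generator_bytes)

# Both forms give the same sum.
list_total = sum(as_list)
generator_total = sum(as_generator)
print(list_total == generator_total)  # True

# One trade-off: a generator can only be walked through once.
# After sum() has consumed it, there is nothing left to add up.
leftover = sum(as_generator)
print(leftover)  # 0
```

The last few lines show the catch with generators: the list can be summed again and again, while the generator is used up after one pass.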

Keep an eye on this blog! I’ll post more articles explaining math stuff regularly.

Worth reading

For more about generators in Python, see Real Python’s article, How to Use Generators and yield in Python.