Pipes Explained

“Pipes” vs. “Series of Tubes”: Not the Same Thing!

To the layperson, the name of Yahoo!’s new service, Pipes, might seem like a reference to Senator Ted Stevens’ ridiculous “a series of tubes” metaphor for the internet. The name is actually a reference to something a little older: a feature in Unix (and its spin-offs, including OS X and various flavours of Linux) called pipes, and their kissing cousin, filters.

That Funny “|” Key on Your Keyboard, Explained at Last!

Diagram showing the pipe/backslash key on a computer keyboard

If you look to the upper right-hand corner of the main part of most computer keyboards, you should see the backspace key. Just below that is the backslash (\) key, which produces a vertical bar when you press it in conjunction with the shift key (this key is pictured to the right). That vertical bar is called the pipe character, and it’s a useful tool in the Unix command line.

Small Pieces, Loosely Joined

“Small pieces loosely joined” is a phrase that explains the organizing principle of the World Wide Web quite well, which is why David Weinberger chose it for the title of his book on the web. Since the origins of the internet and the web are Unix-y, it should come as no surprise that “small pieces loosely joined” also explains the organizing principle of the Unix operating system. Rather than provide large, monolithic systems for managing files and programs, Unix provides a series of what I like to call “small, sharp tools” that can be combined in a Lego-like fashion in whatever way its users see fit. To techies, programmers and tinkerers, this approach makes Unix both workbench and playground, and the flexibility it provides has led to all sorts of interesting and unforeseen applications and tools.

If Unix commands and tools are the small pieces, pipes and filters are the way by which they are loosely joined. Pipes allow you to direct the output of one Unix program to another Unix program, whose output in turn can be “piped” to yet another program. Filters are programs that are used to process text; as such, program output is often piped through them to either filter out unwanted data or to augment the data passing through them.
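To make the filter idea concrete, here’s a minimal sketch that you can paste into a terminal. The five input lines are made up for illustration; grep, sort and uniq are standard Unix filters.

```shell
# Five sample lines go in; each stage reads the previous stage's output.
#   grep -v truck  filters out any line containing "truck"
#   sort           puts identical lines next to each other
#   uniq -c        augments each distinct line with a count
counts=$(printf 'tubes\ntruck\ntubes\ninternet\ntubes\n' | grep -v truck | sort | uniq -c)
printf '%s\n' "$counts"
```

The result should list one “internet” and three “tubes”, with no “truck” in sight. The exact spacing in front of the counts varies from system to system.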

A Little Example Demonstrating the Power of Pipes

I don’t want to bog you down with a complete Unix training course, but I do want to give you a sense of how pipes are used. Those of you on Macs or Linux machines should feel free to open up a terminal application and follow my examples on the command line.

Suppose I have a text file in my home directory called series-of-tubes.txt that contains some of the choice bits of Senator Ted Stevens’ statements. I can use the Unix command cat to print out its contents. If I type

cat series-of-tubes.txt

at the prompt, I’ll get this in response:

Ten movies streaming across that, that Internet, and what happens to your own personal Internet? I just the other day got... an Internet was sent by my staff at 10 o'clock in the morning on Friday, I got it yesterday. Why? Because it got tangled up with all these things going on the Internet commercially. [...] They want to deliver vast amounts of information over the Internet. And again, the Internet is not something you just dump something on. It's not a big truck. It's a series of tubes. And if you don't understand those tubes can be filled and if they are filled, when you put your message in, it gets in line and it's going to be delayed by anyone that puts into that tube enormous amounts of material, enormous amounts of material.

Let’s say I want a word count of that text file. It’s easy to do by piping the output of the cat command through another Unix command called wc, which is short for word count. If I were to type this at the command line:

cat series-of-tubes.txt | wc

I would get this as output:

1 135 744

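That’s the wc command telling me that the file series-of-tubes.txt has one line (I entered the text as one big line without carriage returns), 135 words and 744 characters (strictly speaking, wc counts bytes by default, which comes to the same thing for plain ASCII text like this).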
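If you only want one of those three numbers, wc takes the flags -l (lines), -w (words) and -c (bytes). Here’s a small sketch. The sample sentence is a stand-in so that you can try it without creating series-of-tubes.txt first.

```shell
# Stand-in text, so the example runs without series-of-tubes.txt.
sample="It's not a big truck. It's a series of tubes."

# Pipe the same text through wc three times, asking for one count each time.
lines=$(printf '%s\n' "$sample" | wc -l)
words=$(printf '%s\n' "$sample" | wc -w)
bytes=$(printf '%s\n' "$sample" | wc -c)

echo "lines=$lines words=$words bytes=$bytes"
```

Some versions of wc pad their numbers with leading spaces, so the output may look slightly different from one machine to the next.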

Now let’s suppose I wanted to take those line/word/character statistics and mail them somewhere for analysis at a later time. Well, that’s a matter of piping the output of the cat command through the wc command to get the line/word/character count, and then piping the wc command’s output through the mail command. In Unix, I can very easily do this by typing the following at the command line:

cat series-of-tubes.txt | wc | mail -s "Stevens Stats" joey@globalnerdy.com

The parameters that follow the mail command state that the mail should have the subject line Stevens Stats and that it should be sent to the email address joey@globalnerdy.com. A quick look at my Gmail account confirms that my piped commands did the job:

Gmail, showing the result of the piped unix command

Not bad for a one-line command, eh?

I could go further. Suppose the text file was being updated regularly and I wanted a daily email update of its line/word/character count stats. This can be accomplished by sticking the commands I just typed into a shell script and then setting up a cron job to automatically run that script every day at 10:29 a.m., just in time for my ritual 10:30 a.m. email check. This is the sort of power that the organizing principle of “small pieces loosely joined” gives you.
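Here’s a rough sketch of how that might look. The script name and location (stevens-stats.sh in my home directory) are just assumptions for illustration, and it presumes that the mail command is set up on your machine and that cron sets HOME for you, as it usually does.

```shell
# Hypothetical location for the script; any directory you own will do.
script="$HOME/stevens-stats.sh"

# Write the one-line pipeline into the script file. The quoted 'EOF'
# keeps $HOME unexpanded until the script itself runs.
cat > "$script" <<'EOF'
#!/bin/sh
# Mail the line/word/character counts of series-of-tubes.txt.
cat "$HOME/series-of-tubes.txt" | wc | mail -s "Stevens Stats" joey@globalnerdy.com
EOF

# Make the script executable so cron can run it.
chmod +x "$script"

# Then run `crontab -e` and add this line (minute 29, hour 10, every day):
# 29 10 * * * $HOME/stevens-stats.sh
```

The five fields at the start of the crontab line are minute, hour, day of month, month and day of week. An asterisk means “every”.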

Pipes for the Web

In the example above, I showed what was possible just by using pipes to string Unix commands together. Imagine this sort of principle (small pieces, loosely joined) on an internet scale, with online data sources as the “small pieces”. What does the loose joining?

Enter Yahoo! Pipes.

(Actually, I can’t enter Yahoo! Pipes at the moment. The tech world is so abuzz about this new tech that their site’s completely bogged down. Hopefully, this’ll clear up soon as they throw more servers and engineers at the problem.)

I’ll leave it to Tim O’Reilly to describe Yahoo! Pipes:

It’s a service that generalizes the idea of the mashup, providing a drag and drop editor that allows you to connect internet data sources, process them, and redirect the output. Yahoo! describes it as “an interactive feed aggregator and manipulator” that allows you to “create feeds that are more powerful, useful and relevant.” While it’s still a bit rough around the edges, it has enormous promise in turning the web into a programmable environment for everyone.

There is one important difference between Yahoo! Pipes and those of the Unix variety: while Unix pipes were made with programmers, sysadmins and tech tinkerers in mind, Yahoo! Pipes are made to be more user friendly. While you’ll still need a tiny bit of tech savvy to use Pipes, the user interface, which allows you to visually hook up pieces of code that provide an API, significantly lowers the barrier to entry for creating applications. You no longer have to be a coder!

For example, take a look at this web app created with Pipes that aggregates news alerts from several sources and provides a search box that lets you look for news items with specific keywords. Until now, you’d have to be adept at a programming language and its internet and XML-processing libraries to build such an app. Now, you can do it with Pipes (click here to see it at full size):

Example web app written with Pipes

This is pretty exciting stuff. I’m sure you’ll see more written about it by all sorts of folks, including George and me.

Links: