Web 2.0 Summit: Riya Launches Like.com

Facial recognition? Feh! Try fashion recognition. I suppose the idea is that the latter tack will aid in some financial recognition for Riya.

Visual search engine Riya has used the Web 2.0 Summit to launch their new service, Like.com, a visually-driven shopping search engine. The premise is pretty simple: when you see something you like (say, a handbag), Like.com does its best to find similar items. There are a number of ways into the search: you can text search for something (“bag,” or “D&G”), you can browse their categories (for example, watches) and find an example of an item you like, or you can browse through their celebrity pictures (as if I wanted to take my style cues from my virtual identical twin, Brad Pitt). Like.com currently offers a few ways to refine those searches once you get closer to what you're looking for (such as focusing on the details that really matter, or the color of the item). It's nifty, as it stands, but not a home run. The way Riya's talking about their future plans, however, gives me some hope for their prospects. Dan “Between the Lines” Farber has more:

Like.com offers several search capabilities, including the ability to search by image instead of text; find items that have specific features, such as a watch bezel; find color variants of the item via a color picker; and find clothing, shoes and accessories similar to those worn by celebrities (Like.com includes 100,000 celebrity images), with the ability to upload photos coming in the near future. Like.com will also have a browser extension to initiate likeness searches from any site, as well as pages to save searches and a recommendation engine. After launch, Like.com will also have a cross-matching feature. “If you have a hat and want a shirt to go with it, you drag a slider and search on a new category,” Shah said.

The keys for me: matched cross-selling (i.e., “Show me shoes to go with that handbag”) and the ability to initiate likeness searches from any page on any site. Far more than image upload, that seems critical.

The use case for this kind of shopping has been around for a long time. I remember when Time Warner was conducting their interactive TV trials in Orlando back in the early '90s, one of the tired examples of interactive multimedia commerce that constantly got trotted out was the ability to freeze the show you were watching, highlight any item in, say, Jerry Seinfeld's apartment, and go shopping for it. The ability to take advantage of serendipity, impulse, and context will be important to Like.com's success.

I'm starting to speculate now (and I have no basis for this other than it seems pretty obvious to me), but another logical avenue for Riya to pursue, in addition to the browser plug-in, would be affiliate relationships with media outlets like People, Gawker, E! Online, etc. Those sites could benefit financially from driving buyers to the merchants in Like.com's stable, and Riya would gain both relevant content and wide distribution for their search engine. The merchants, of course, would get the traffic. The celebrity news sites could provide Like.com with a properly-structured and tagged image feed, allowing Riya to keep their index relevant and fresh.

This is all possible because Riya has taken the hard road of automating visual search, as opposed to relying on human-supplied metadata (and as my friend John Henson was fond of saying around the Opencola offices, “Real data is better than metadata.”). Mike “TechCrunch” Arrington notes how Riya's approach, difficult though it is, is unique:

There are lots of other image search engines on the web today. But all of them only take queries as text, and compare those text queries to the meta data attached to an image file. This data is notoriously thin, and companies like Google are resorting to using human labor to attempt to add descriptive keywords to images stored on their servers. Even specialty image search engines like Pixsy have fairly thin meta data for images. And all of the existing search engines allow only text for search queries.

The Like.com engine takes both text and images as queries, something no one else does. To return results based on an image query, Like.com compares a “visual signature” for the query image to possible results. The visual signature is simply a mathematical representation of the image using 10,000 variables. If enough variables are identical, Like.com decides the images are similar.
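Arrington's description amounts to a component-wise comparison of two fixed-length vectors. Here's a toy sketch of that idea; the tolerance, the match threshold, and the exact 10,000-value format are all invented for illustration, since Riya hasn't published the actual algorithm:

```python
import numpy as np

def signatures_similar(sig_a, sig_b, tol=0.01, min_matches=8000):
    """Toy version of the likeness test Arrington describes:
    signatures are fixed-length vectors (10,000 values here), and two
    images count as similar when enough components nearly agree.
    The tolerance and threshold are assumptions, not Like.com's."""
    sig_a = np.asarray(sig_a)
    sig_b = np.asarray(sig_b)
    matches = np.sum(np.abs(sig_a - sig_b) <= tol)
    return bool(matches >= min_matches)

# A signature and a variant that differs in 500 of 10,000 components
rng = np.random.default_rng(0)
base = rng.random(10_000)
variant = base.copy()
variant[:500] += 0.5  # push 500 components well past the tolerance
print(signatures_similar(base, variant))  # True (9,500 components still match)
```

An unrelated image's signature would share almost no components within tolerance, so the same test would reject it.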

How much heavy lifting is involved? I throw to Farber for the facts:

The core technology is even more complex than face recognition technology, Shah said. Like.com crawls target merchant sites and retrieves the highest quality images. It takes about 20 seconds per image to preprocess, creating a visual signature and indexing the image.

Search results are returned in under a second; the server farm consists of 250 quad-core servers, each loaded with 16 to 32 gigabytes of memory. Like.com converts every picture into a visual signature, a 10-kilobyte vector consisting of about 5,000 numbers. The “likeness” algorithm determines the order of results based on shape, color and pattern.

“We are extracting and computing the visual signatures and pulling out pieces for comparison,” Shah said. “The results will never be worse than a text search. We index all the metadata and even normalized some of it.” Currently, Like.com only indexes the merchant sites. The soft goods vector images are more detailed than faces, which are encoded as 3-kilobyte vectors, and include about 40 elements, including shape densities, color histograms broken into quadrants and other properties, such as glossiness and sheen (analyzing color changes in the middle of objects).
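Of the signature elements Farber lists, “color histograms broken into quadrants” is concrete enough to sketch. This is a minimal illustration only; the bin count, the normalization, and the 4-quadrant-by-3-channel layout are my assumptions, not Like.com's actual pipeline:

```python
import numpy as np

def quadrant_color_histograms(image, bins=8):
    """Sketch of one signature component Farber mentions: a color
    histogram per image quadrant. `image` is an H x W x 3 array of
    RGB values in [0, 255]. All parameters here are illustrative."""
    h, w, _ = image.shape
    quadrants = [
        image[:h // 2, :w // 2],  # top-left
        image[:h // 2, w // 2:],  # top-right
        image[h // 2:, :w // 2],  # bottom-left
        image[h // 2:, w // 2:],  # bottom-right
    ]
    feats = []
    for quad in quadrants:
        for channel in range(3):  # R, G, B
            hist, _ = np.histogram(quad[..., channel],
                                   bins=bins, range=(0, 256))
            feats.append(hist / hist.sum())  # normalize per quadrant/channel
    # 4 quadrants x 3 channels x 8 bins = 96 numbers
    return np.concatenate(feats)

img = np.random.default_rng(1).integers(0, 256, size=(64, 64, 3))
sig = quadrant_color_histograms(img)
print(sig.shape)  # (96,)
```

Splitting the histogram by quadrant is what lets a signature distinguish, say, a shoe that is dark on top and light on the sole from one with the reverse coloring, which a whole-image histogram cannot.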

So Like.com carries roughly 3X the data per item that Riya does for their facial recognition search.
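Farber says the “likeness” ordering combines shape, color and pattern. One way that could plausibly work (purely speculative; the slice boundaries and weights below are invented) is a weighted distance over named portions of the ~5,000-number signature:

```python
import numpy as np

# Speculative sketch of a likeness ranking: each part of the signature
# vector is assumed to hold one feature family, and results are ordered
# by a weighted distance. The slices and weights are illustrative only.
SLICES = {"shape": slice(0, 2000),
          "color": slice(2000, 3500),
          "pattern": slice(3500, 5000)}
WEIGHTS = {"shape": 0.5, "color": 0.3, "pattern": 0.2}

def likeness_distance(query, candidate):
    # Lower distance = more alike.
    return sum(w * np.linalg.norm(query[SLICES[p]] - candidate[SLICES[p]])
               for p, w in WEIGHTS.items())

def rank_results(query, catalog):
    # Return catalog indices ordered from most to least alike.
    return sorted(range(len(catalog)),
                  key=lambda i: likeness_distance(query, catalog[i]))

rng = np.random.default_rng(2)
query = rng.random(5_000)
catalog = [rng.random(5_000),                      # unrelated item
           query + rng.normal(0, 0.01, 5_000),     # near-duplicate
           rng.random(5_000)]                      # unrelated item
print(rank_results(query, catalog)[0])  # 1: the near-duplicate ranks first
```

Weighting the parts separately would also explain the refinement UI described above: sliding a control toward “color” or “shape” is just changing the weights before re-ranking.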

Robert “Scobleizer” Scoble shares a few interesting facts about Riya's new service:

1) The URL cost $100,000. In the interview [Riya CEO Munjal Shah] explains how they bought it. It involved finding the guy who owned it, jumping a fence, and leaving a bottle of wine with a note on it (he wouldn’t answer his email).
2) Riya was pretty close to being sold to Google. If it had been, they never would have worked on this search engine. So, by getting turned down by Google, Riya came back with a much better business.
3) Just the jewelry set takes 20GB of RAM.
4) Munjal still believes in blogs, but for this launch Riya talked with fashion bloggers, and journalists outside the tech world, like those at People magazine. Why? Well, this site — in its current incarnation — will be most interesting to women and non-geeks. If you’ve looked at who participates here, it’s heavily male.
5) Why not keep working on face detection? Because they learned through user testing that they’d never be able to make it good enough. They found that by focusing on visual image searches they can get a much more satisfied user base.
