Sourced From Search Engine Watch


Google Base, which some people got an early glimpse at last month, is now live (or supposed to be) for anyone to play with (though it has been down a few times, and already had one security threat patched). So what’s in it, and what’s it all mean?

Adding Items

Google Base allows people to list anything they want. Want to create an entry for a recipe item? An event listing? A job posting? A book review? You name it, you can pretty much post it.

By default, some existing item types are suggested: Course Schedules, Events & Activities, Jobs, News & Articles, People Profiles, Products, Reference Articles, Reviews, Services, Vehicles, Wanted Ads.

Select any existing item type, and you’re sent to a form to fill out with some fields already set that you may want to use. For instance, the Product form has areas to enter a price, quantity, brand, condition and payment options. Did the form miss something? You can easily add new fields.

None of the existing item types work for you? Then you’ve got an option to create your own custom item type.

Not The Killer Of eBay & Gang

Is this the eBay-killer, Monster-killer, Craigslist-killer that some expect? Maybe over time, but certainly not now.

Let’s take eBay as one example. I’ve bought plenty from eBay, and a key reason I’ve used the service is the community that surrounds it. There are rules, plus buyers and sellers evaluate each other. It’s easy to decide if I want to risk purchasing something from a seller, based on their ratings. Google Base lacks any such functionality for the moment. Potentially, it could come — but it’s not there yet.

How about Monster? OK, I don’t do any hiring, but the disadvantage of Google Base for job search is immediate. Looking for a job? Google Base gives you one single box, and that’s it. Perhaps entering something like “advertising jobs in new york city” will ultimately work to get you a listing of all jobs appropriate to that. But it might come back with false matches as well, since no one is required to use predefined forms or a predefined category structure. In one example, Google Base brought back a matching job for a model agency because the job description mentions that models do print advertising work.

In contrast, look at Monster. It’s more user friendly to start. While searchers generally don’t make use of search options and drop-down boxes, I think that vertical search engines are an exception. Monster asks you to be specific. What type of job do you want, one drop down box asks (Advertising/Marketing/Public Relations). In the next box, where do you want the job (New York City). The results that come back after making these two choices look good and really shouldn’t have any false positives.
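To make the contrast concrete, here is a minimal Python sketch of the two approaches. The listings, the field names (`category`, `location`) and both functions are hypothetical illustrations, not the actual schema or search logic of Google Base or Monster.

```python
# Hypothetical job listings; field names and values are illustrative only.
JOBS = [
    {
        "title": "Account Executive",
        "category": "Advertising/Marketing/Public Relations",
        "location": "New York City",
        "description": "Advertising jobs managing client accounts in New York City.",
    },
    {
        "title": "Models Wanted",
        "category": "Modeling",
        "location": "New York City",
        "description": "New York City agency seeks models for print advertising jobs.",
    },
]


def keyword_search(jobs, query):
    """Single-box search: keep jobs whose text contains every query word."""
    words = query.lower().split()
    hits = []
    for job in jobs:
        text = (job["title"] + " " + job["description"]).lower()
        if all(word in text for word in words):
            hits.append(job)
    return hits


def fielded_search(jobs, category, location):
    """Drop-down style search: match on predefined fields only."""
    return [
        job for job in jobs
        if job["category"] == category and job["location"] == location
    ]


keyword_hits = keyword_search(JOBS, "advertising jobs in new york city")
fielded_hits = fielded_search(
    JOBS, "Advertising/Marketing/Public Relations", "New York City"
)

print([job["title"] for job in keyword_hits])
print([job["title"] for job in fielded_hits])
```

In this toy data the keyword search returns both listings, including the modeling job that merely mentions advertising, while the fielded search returns only the advertising position.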

Details, Labels & Structured Data

Any item has both “details” and “labels” associated with it. Both are a way to tag or categorize content, but they don’t operate quite the same way.

Details are really meant for structured data, delimited data, fielded data, data organized in a way so that you can sort easily. For those not familiar with databases, let me step it down to the spreadsheet/table level:

Title                      | Author   | Rating
Pandora’s Star             | Hamilton | Great
1634: The Galileo Affair   | Flint    | Bad
The Search                 | Battelle | Great
Dune: The Battle Of Corrin | Herbert  | Good

The above recounts some recent books I’ve read. Each book is listed on its own row, with the title, author and my own rating of the book put into different columns. In Google terms, the headers above each column (Title, Author and Rating) are the “details.” The rows are items. So if I’m listing a book “item,” then I’d have a detail for each column: the title, author and rating.

The advantage to storing information in a tabular model like this is that if you want to, down the line, it’s easy to sort and filter information. Want to see all the books with a “Great” rating? Since I have a Rating detail, you know exactly where to look and can get back just those matches.
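As a rough sketch of why fielded data is easy to work with, the book list can be modeled in Python as one dictionary per item, keyed by detail name. The helper names `filter_by_detail` and `sort_by_detail` are made up for this illustration and are not part of any Google Base interface.

```python
# Each item is a row; each key is a "detail" (a column header).
BOOKS = [
    {"Title": "Pandora's Star", "Author": "Hamilton", "Rating": "Great"},
    {"Title": "1634: The Galileo Affair", "Author": "Flint", "Rating": "Bad"},
    {"Title": "The Search", "Author": "Battelle", "Rating": "Great"},
    {"Title": "Dune: The Battle Of Corrin", "Author": "Herbert", "Rating": "Good"},
]


def filter_by_detail(items, detail, value):
    """Return only the items whose given detail equals the given value."""
    return [item for item in items if item.get(detail) == value]


def sort_by_detail(items, detail):
    """Return the items ordered alphabetically by the given detail."""
    return sorted(items, key=lambda item: item[detail])


great_books = filter_by_detail(BOOKS, "Rating", "Great")
by_author = sort_by_detail(BOOKS, "Author")

print([book["Title"] for book in great_books])
print([book["Author"] for book in by_author])
```

Because every item stores exactly one value under "Rating", the filter knows precisely where to look, and the same holds for sorting on any other detail.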

Now look at this:

Title                      | Author   | Rating | Keywords
Pandora’s Star             | Hamilton | Great  | sci-fi, trilogy
1634: The Galileo Affair   | Flint    | Bad    | alternative history, sci-fi
The Search                 | Battelle | Great  | google, yahoo, overture, altavista, search history
Dune: The Battle Of Corrin | Herbert  | Good   | sci-fi, dune

See that keywords column? That’s the equivalent of “labels” with Google Base. In that column, I’ve just tossed in some words that may help me keyword search and locate the information. But because I’ve put multiple words into each cell of that column, there’s no clean way to sort or filter on it.
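A small Python sketch, under the assumption that labels are simply a free-form list of words attached to each item, shows what the keywords column is good for. The function name `find_by_label` is hypothetical.

```python
# Labels modeled as a free-form list of words per item (an assumption
# made for illustration, not a description of Google Base internals).
BOOK_LABELS = {
    "Pandora's Star": ["sci-fi", "trilogy"],
    "1634: The Galileo Affair": ["alternative history", "sci-fi"],
    "The Search": ["google", "yahoo", "overture", "altavista", "search history"],
    "Dune: The Battle Of Corrin": ["sci-fi", "dune"],
}


def find_by_label(labels_by_title, label):
    """Return the titles that carry the given label, ignoring case."""
    wanted = label.lower()
    return [
        title
        for title, labels in labels_by_title.items()
        if wanted in (entry.lower() for entry in labels)
    ]


sci_fi = find_by_label(BOOK_LABELS, "sci-fi")
print(sci_fi)
```

A label is either attached to an item or it isn't, so it works for keyword lookup. Unlike a detail such as Rating, though, there is no single value per item to order the results by.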

Google’s Salar Kamangar, vice president, product management, put it this way:

“Labels are what something’s about, attributes have values.”

Why get into all this detail about details and labels at all? To help explain why this is sort of Google getting into tagging — and not — plus some of the invisible web advantages it offers.

It’s Tagging, But Old School Tagging

Let’s take tagging first. Tagging is a way for people to self-categorize content for their own use or to share with others. For example, Yahoo’s My Web system lets you save web pages under whatever tag you want to assign to them, making it easy for you to later find all the stories you’ve saved or share stories tagged with that particular word with others.

Google uses the term labels, but to a limited degree, Google Base could be used similarly to My Web and other tagging systems. Anyone can save anything. So if you wanted to save all the stories you’ve found about Google and label them “google,” have at it. But unlike proper tagging systems, there’s no easy way for everyone to then find the stories you and others have saved.

That’s because Google Base is all keyword driven. Search for Google, and you’ll get back EVERYTHING that matches that word, not just news items. OK, after your search results appear, you can start to narrow things down in various (and unpredictable) ways. But it’s not anywhere near as easy as with a proper tagging system.

But what about that other type of tagging, meta tagging? You know, things like the meta keywords tag that let you associate words with pages, to better describe them. The tag itself might not be back, but the principle has come to Google Base with a vengeance.

Labels let you describe what something is about. Want to list an actual web page in Google? No problem! You can enter a URL, assign it some words, and away you go.

How heavily labels will be used as part of the ranking process remains to be seen. But I have already seen enough to find the entire thing a giant step backwards.

What’s a page about? Isn’t analyzing every word on the page better than depending on hand categorization? How about using some of the copious amounts of technology that exist to break down a document to main keywords or themes? Going to labels isn’t Search 2.0 or Search 3.0, it’s flipping backwards to Search 1.0.

It’s cool that a page you import can also be classified using details so that you know who authored it, exactly when it was written and so on. Many people want that type of information and ability to sort that way. But Google could get that information automatically by simply supporting some of the Dublin Core meta tags that have been ignored for years. Embedded into web pages, Google would gather up important details rather than having to hope some site author makes the effort to add them via Google Base.
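For a sense of how little work that would take, here is a minimal Python sketch that pulls Dublin Core details out of a page using only the standard library. It assumes the common convention of `meta` tags whose names start with `DC.`. The sample page, its author and its date are invented for illustration, and nothing here reflects how Google actually processes pages.

```python
from html.parser import HTMLParser


class DublinCoreParser(HTMLParser):
    """Collect <meta name="DC.*" content="..."> values from an HTML page."""

    def __init__(self):
        super().__init__()
        self.details = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        if name.lower().startswith("dc.") and content is not None:
            # Strip the "DC." prefix, so "DC.creator" becomes "creator".
            self.details[name[3:].lower()] = content


# Hypothetical page; the author and date values are made up.
SAMPLE_PAGE = """
<html><head>
<title>Example story</title>
<meta name="DC.creator" content="Jane Example">
<meta name="DC.date" content="2005-11-16">
<meta name="keywords" content="ignored, not dublin core">
</head><body>Story text.</body></html>
"""

parser = DublinCoreParser()
parser.feed(SAMPLE_PAGE)
details = parser.details
print(details)
```

The result is a set of name/value pairs, which is the same shape as the “details” a site author would otherwise have to type into Google Base by hand.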

Something like that possibly could happen, however, down the line:

“If we wanted to be more aggressive about labeling, we could accept meta tags progressively,” Kamangar said.

Of course, you can do imports of data, so anyone doing an XML feed should be able to easily import into Google Base the latest stories they’ve written, along with useful metadata about them. But even then, since Google already finds XML feeds, shouldn’t it be smart enough to do the unification without human intervention?
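As an illustration of how mechanical that unification could be, the following Python sketch turns an RSS 2.0 style feed into a list of items with details. The feed content is invented, the function name `feed_to_items` is hypothetical, and this is not Google Base's actual import format or process.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed; the stories, URLs and dates are made up.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Site</title>
    <item>
      <title>First story</title>
      <link>http://www.example.com/first</link>
      <pubDate>Wed, 16 Nov 2005 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second story</title>
      <link>http://www.example.com/second</link>
      <pubDate>Thu, 17 Nov 2005 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
"""


def feed_to_items(feed_xml):
    """Map each <item> in the feed to a dict of detail name -> value."""
    root = ET.fromstring(feed_xml)
    items = []
    for node in root.iter("item"):
        # Every child element of <item> becomes one detail.
        items.append({child.tag: (child.text or "").strip() for child in node})
    return items


items = feed_to_items(SAMPLE_FEED)
for item in items:
    print(item["title"], item["link"], item["pubDate"])
```

Each feed entry already carries its own field names, so the mapping to details needs no hand labeling at all.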

I’ve got no doubt we’re about to see a significant number of site owners start submitting and tagging their information in Google Base, in hopes they’ll do better with Google itself. I suspect the result will be a lot of wasted time and Google Base getting overrun with spam. But perhaps I’m wrong, and time will tell.

By the way, why not just call labels tags and go with a term that many commonly understand, regardless of whether we’re talking old school or new school tagging?

“We thought about that, but we didn’t have any religion about one or the other. With Gmail we say labels, and we thought we’d stay consistent with that,” Kamangar said.

Detailed Data & The Invisible Web

Now back to details and the invisible web, i.e., content locked behind database walls, inaccessible to regular spidering. Google Base presents a possible boon for this type of information.

Want to export your entire job database to Google and not afraid to do so, since you control the destination URL and will just send people back to your own site? Google Base makes that possible! Just upload your data, and all your categories/fields will be translated into “details” of your choosing. Want to share with Google the temperature database you’ve been keeping for your city, something they’ve never spidered because it was only accessible through a search box? Now you can!

This is a real advancement, and it’s one I hope we’ll see improve in two ways — the ability to have private databases and named databases.

For private databases, I mean that Google Base is a simple way for anyone to create a collection of names and phone numbers for the local soccer club. But you don’t want the world to have access to that information, only people you choose. Private databases would be helpful.

By named database, I sort of mean mini-Google Bases. If someone’s created an exceptionally good set of information, I want to be able to search directly against just that information, rather than all of Google Base. It’s a pain to have to hope or figure out that refinement will let me do that. Give me the ability to name and bookmark a particular database.

For example, imagine an entomologist who uploads a database about insects, maybe a really cool one accessible to ordinary people. Want to search against just that data? It would be nice if the data set could be named and have its own custom home page that people could be directed to.

From Google Base To Google Itself

By the way, what goes into Google Base won’t necessarily stay in Google Base. That is, Google fully intends for material in Google Base to surface within regular Google results, in some way.

How? Via a OneBox display? In place of some “regular” listings? Google doesn’t know yet.

Google Base & The Master Plan

Let me cap off with the big picture question. How’s this fit into the Google master plan?

“We think of it as an extension of existing content efforts we have now,” said Kamangar. “We didn’t have a general way for people to push information to us that they didn’t think was being represented in our search results.”

For example, Froogle started out with Google scraping web pages and trying to guess what was a product name, a price and so on. But it was much more accurate to let merchants upload feeds that told Google exactly this type of information.

Google Base is a way for Google to let anyone upload information to Google about anything. That’s the master plan. Exactly how that master plan will unfold isn’t clear. Maybe there won’t be any particular data types that are uploaded. Maybe it really will turn into a great place for those with classified listings that will lead to a dedicated spin-off service. The overall goal seems to be to put this tool out there and see what people make of it.