selberg.org

Bringing us the best of 1995 to you here today!

I haven’t had a search post in a while, but there are some posts out there I thought I’d talk a bit about. In the last chapter of his book The Search, John Battelle expands a bit on where search ought to be. Our own Robert Scoble adds his own $0.02 on the perfect search, followed by a response by Danny Sullivan over at Search Engine Watch.

Here’s the gist: We used to have great features, many powered by people, and we don’t anymore. We want them back.

Yes and no.

There’s a lot less human editorial in search nowadays, and that’s a good thing. It allows us to scale, not just in the size of the web, but also globally. Anyone know what the best page for comparing cellular plans is in West Africa? Me neither, and we’re not hiring editors to figure that out. That’s why smart people develop algorithms to look at the web — links, body, etc. to figure that stuff out.

Anyone remember Magellan by McKinley? It was an early competitor to Yahoo back in about 1995. The idea was to hire a bunch of human editors to create a large “best of the Web” version of Yahoo, with each site rated and quality checked. McKinley isn’t even around anymore, at least according to the new real estate page at www.mckinley.com, let alone Magellan.

That being said, one thing we aren’t doing all that well at is enabling people to spread their own information. Yahoo is making some moves here with their social searching stuff, but it’s not even close to what I want.

For example, for an upcoming trip to China, I’m trying to figure out if GPRS works on a China Mobile prepaid SIM card. So I use the query “China Mobile prepaid SIM GPRS setup.” Google’s results are crappy, so are ours at MSN (and ours even look crappier), and Yahoo’s results don’t completely suck.

This should be an easy query, if not for the potential cross-lingual issue here. All I want to know is for a China Mobile prepaid account, how do I set up GPRS? Presumably, there’s a FAQ on the China Mobile website. Probably in Chinese, but I suspect it’s there. Or at least on some other support website; judging by the results, the information seems to be out there, just not in quite the form I want.

But yet nobody really gets it.

Human editors aren’t really going to help with this, but human beings might. This is where blogs and forums come in, especially forums. I managed to find a few where this question had been asked and answered, and so I’m pretty sure I know what to do now.

So now the question comes… how do we enable and harness human knowledge in a meaningful way? Not just indexing documents on the Web, but truly indexing knowledge for mass consumption. That’s the question. And while we used to have lots of nifty features on previous (and some still living) search engines, we still haven’t begun to tap into searching human knowledge. It’s just that tough of a problem.

Oh, and I’ll excerpt this from Danny’s post, worth the read:

FYI, John asked me for thoughts on The Perfect Search that didn’t make the cut for the book. But if you’re curious, here’s what I emailed him back last September. I’ll ask him if I can add in the email I was responding to, that puts my response in better context. If so, you’ll see it added here later. My response:

I can’t imagine such a world. It makes a nice pitch for the search companies, but knowledge is a messy thing.

If we’re talking about indisputable facts, it’s a bit easier. Thomas Jefferson was the third president of the United States. I know of no one who questions that.

Who was the first person on the moon? Neil Armstrong, unless you are of the contingent that believes moon landings never happened. OK, I think those folks are crackpots. But the perfect search that comes back with Armstrong isn’t the perfect search for them.

What’s the answer to gay marriage? Who killed Kennedy? Was Bush right or wrong for going into Iraq? Is the MMR vaccine safe for children?

None of these can be answered definitively. They’re more than just questions with nuances. They’re questions that have answers ultimately determined by the reader themselves, answers that may be different for each person, based on what they choose to believe after reviewing many opinions.

I can envision a system that tries to collect for you a variety of references on topics. Maybe it even assembles them into an encyclopedia-like, wiki-like page. The assembly of this knowledge might be considered “answers” by some. To me, it still represents the start of a knowledge quest. It’s akin to exactly how search works now: a list of references, with the searcher still needing to explore.

I’m sure we’ll see search advance on simply pointing people to the easy stuff, the facts that can be produced, direct navigation to web sites and so on. I’m also sure we’ll see search improve to better understand what we’re interested in, based on past habit and visits. But all knowledge will never be accessible, unless they figure out a way to digitize the minds of everyone living and dead. Even when dealing with what knowledge we do have chronicled, distilling a perfect answer is impossible. God could provide a perfect search as you outline. Search engines aren’t God today, and they’ll never be.

Having said this, I was aghast last year when some Wi-Fi exec likened Google to God in Friedman’s column. While we may not have the perfect search, nor will we, some people may believe search engines (and the web by extension) already offer it.

We’ve had articles about judges searching the web themselves to see if they can dig up evidence. Fox News lamely tries to defend calling the BBC anti-American by citing search counts. Students apparently are abandoning traditional research methods and assuming the magic little search box brings up the right answer. I’ve watched people spend tons of time searching for a company’s phone number rather than just calling information. Two television shows I watched this week had characters talking about how they “Googled” something, with the assumption that whatever they retrieved must be correct. Some people already believe a perfect search tool exists, and the way it is shaping them is that they’re relying on it too exclusively.

So the threat is this. In a world where people believe a perfect search exists, that world may fail to seek out knowledge in other ways. Someone blogs something that’s factually incorrect. Search picks this up. There are no other references out there. Search is perfect, ergo, what’s wrong becomes right. No one bothers to actually follow up on the fact.

I was fortunate enough in college to hear Loren Needles from Analytica talk about the need to fully question any facts. At the time, he talked about how a recent hurricane had been blamed for a dropoff in some economic indicators. In short order, he demonstrated how there was no way the hurricane could have caused a dropoff of such extent. Despite this, newspapers across the country accepted the explanation as fact.

That’s what a perfect search potentially does for us, makes us less questioning because we think the answers are all in that little box. They aren’t, nor will they ever be.

7 Responses to “Bringing us the best of 1995 to you here today!”

  1. Cybermagellan Says:

    Why not offer an algorithm based on the six key questions we all have: Who?, What?, When?, Where?, Why? and How? Using this to perform your search, it should have come back with results similar to “how do I set up GPRS?” with a link to the China Mobile website. I’ve been thinking about this for a while now and find that most of the time when we go to Google, MSN, AOL, or Yahoo! we find ourselves searching on a question that we have.

    Offering the answer back to that question should pinpoint results better than a static search. Just my two cents.

  2. Erik Selberg Says:

    So one thought is thinking about query intent… e.g. if the query looks like a “How do I…” then returning pages that look like they answer “How do I….” questions. It’s a refinement step, certainly.

    Here’s a more concrete example… I get a bunch of hits on this blog (not to mention links) to my post on unlocking an AudioVox 5600 for free. The reason is after I figured out the instructions from a ton of different sources, I decided to put up The Answer and make things simple for people.

    That being said, there’s a lot of The Answers out there for questions that haven’t been put up by someone… and I’m sure a lot of it is simple knowledge that once you go through it, you know and forget. Like setting up GPRS, for example.

  3. Cybermagellan Says:

    Right…so if you knew a search engine used a more human approach, the way to help it index better would be to post the question, like Q: How do you unlock an AudioVox 5600? Then post A: and give your steps. That way when the spider hits your site it would index the part that says Q: “foo” or whatever, and then when someone asks that question to the search engine the relevance would almost match at 100%.

    I posted more about it at Perfect Searching is done by humans (Sorry if that comes out plain text.)

  4. Paul From Nata1 Says:

    I agree.

    On a limited corpus (i.e. for a search appliance), then human rating, pinning, adding extra meta data is extremely extremely useful (why I loved building .Search on top of Community Server)

    For an unlimited corpus - no way! That’s what we’re talking about right? Users altering ranking modifiers as opposed to bots? I agree 100% then that humans need to be creating/updating rules in the rule engine, but definitely not altering ranking modifiers a site at a time!!!

    Is it ever really possible to ‘watch’ people’s behaviors when they search, and make meaningful ranking modifier changes, or do what I do in .Search - pin web pages/web sites?

    I love Scoble’s search posts - it’s good to see them get attention. I’ll have to read this exchange over and over to truly ‘get it’.

  5. Matthew Says:

    Erik, I think the key is to analyze how the user manipulates, uses and views search results. Problem is, right now we don’t have much to analyze because the UI is limited - there’s not much the user can do with Google, MSN or Yahoo results besides scrolling through them and clicking on them. Not a very efficient workflow to begin with, but also not a particularly good environment for “handling” data. I’m working on a methodology which incorporates UI, and I think that’s going to be the next big thing.

  6. Cybermagellan Says:

    Is it ever really possible to ‘watch’ people’s behaviors when they search, and make meaningful ranking modifier changes, or do what I do in .Search - pin web pages/web sites?

    Yes it is…and the more a certain link gets clicked, the higher up the rung it should get till it is #1 or whatever. Eventually you’ll have multiple #1s; the point, however, is that these will eventually filter down by taste of content. If you think about it, right now it is possible, but only in a particular way. You have to actually have metadata that says Q: FooQ A: FooA and then have someone search for FooQ. However, what would happen if you were to attach, just on the sly, a What+FooQ, Who+FooQ, When+FooQ, etc.?

    I would be interested in perhaps researching how this would work….perhaps making something to track items and see how they are ranked…Erik, I pimped out your blog post so sorry if you get a ton of hits :-D

  7. Erik Selberg Says:

    So… yes and no as far as the UI goes. You can certainly instrument a rich (or AJAX, or whatever) client that reports every little movement back to the system and have the system attempt to learn from that behavior. Nobody has done that yet without making something totally unwieldy, but perhaps some of the new AJAX stuff would enable that. And you know with all the coolness that we’re seeing, somebody, if not everybody, will try dumping a ton of fun AJAX code onto search results to make them more interactive. Sort by Date! Collapse Snippets! Export to XML! Mark some as relevant, mark other results as not relevant, and requery!

    The bigger issue is whether that actually helps people.

    These features invariably help people with answering hard or unanswerable queries… such as researching whether or not Miers would make a good associate on SCOTUS. But they don’t help for simple queries, or even slightly complex queries that are very answerable, such as “unlock audiovox 5600.” So, I’m left wondering whether we do in fact want one interface that does everything, or two (or more)… one for simple, direct queries, and another that’s more of the researching / answer hard questions route.
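
For anyone who wants to play with the ideas from the comments above, here is a rough Python sketch of three of them together: guessing query intent from the six question words, matching a query against a page’s “Q:” line, and using click counts to break ties. Everything in it is an assumption made up for illustration (the function names, the 0.5 intent bonus, the toy pages and click counts); it is not how MSN, Google, or Yahoo rank anything.

```python
import re
from collections import Counter

# The six question words from the comments; treating the first word of a
# query as its "intent" is a simplifying assumption for this sketch.
QUESTION_WORDS = ("who", "what", "when", "where", "why", "how")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def query_intent(text):
    """Return the leading question word, or None for a plain keyword query."""
    words = tokenize(text)
    if words and words[0] in QUESTION_WORDS:
        return words[0]
    return None


def overlap(query, question):
    """Fraction of the query's tokens that also appear in the page's Q: line."""
    q, p = Counter(tokenize(query)), Counter(tokenize(question))
    total = sum(q.values())
    if total == 0:
        return 0.0
    return sum((q & p).values()) / total


def rank(query, pages, clicks):
    """Order page URLs for a query.

    pages maps a URL to the text of its hypothetical "Q:" line, and clicks
    maps a URL to how often it was clicked. Pages are sorted by token overlap
    plus an arbitrary 0.5 bonus when the question word matches; click counts
    only break ties between equal scores.
    """
    intent = query_intent(query)

    def score(url):
        s = overlap(query, pages[url])
        if intent is not None and query_intent(pages[url]) == intent:
            s += 0.5  # arbitrary bonus, chosen only for illustration
        return (s, clicks.get(url, 0))

    return sorted(pages, key=score, reverse=True)


if __name__ == "__main__":
    pages = {
        "a": "How do you unlock an AudioVox 5600?",
        "b": "What is an AudioVox 5600?",
        "c": "AudioVox 5600 review",
    }
    print(rank("how do I unlock an audiovox 5600", pages, {}))
```

Even this toy version runs into the problem from my last comment: it does nothing for a query like “China Mobile prepaid SIM GPRS setup,” which has no question word at all, so the intent step simply gets skipped.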
