Greg Linden makes some interesting points regarding metasearch, aka federated search. When I created MetaCrawler back in 1994, I did so with the belief that combining the results of multiple engines would provide better results than any single engine — such as WebCrawler, Lycos, InfoSeek, and OpenText. It turns out this is true. More importantly, it’s still true. Google, Yahoo!, and MSN / Live Search all provide good results, and when they differ a simple voting strategy to combine results makes the sum greater than any of the individual results.
So why isn’t MetaCrawler dominant instead of just a minor blip compared with the big three?
One of the key downsides of metasearch is performance: the metasearch engine is always a little slower than the average engine. But is performance the only issue, or even the dominant one? Not clear.
What about operational reliability? An issue with federation is that each federated service needs to be available and reliable — each system needs to be able to handle the load. With federation, especially federating external systems, operational issues are more likely, and scale is more difficult. As query volume increases, each federated service needs to be scaled up appropriately as well.
Could it be the brand? Certainly, brand + quality is better than either in isolation, but does brand + quality surpass poor branding and better quality? Again, not clear.
I can’t say for sure… it’s probably some combination of all of the above. This being said, in thinking about it, I suspect operational reliability and to an extent control over data will win out from a business perspective. Federation means others are in control of part of the solution — and even if the others are part of the same company but different groups, it can still be difficult to ensure that the federated services are in sync with the entry point. So, to that extent, it should come as no surprise that folks like Google will advocate putting everything in one index.
“Combining the results of multiple engines would provide better results than any single engine … It turns out this is true.”
Hi, Erik. What do you mean by better results?
It is clear that metasearch can provide more comprehensive results than any one engine. No argument there.
But, it is not clear to me that it is easy to provide more relevant results. In particular, without access to the underlying indexes and raw data available to each source engine, it seems like it is very hard, perhaps impossible, to come up with a relevance rank that is as good at placing the most relevant documents in the top slots.
That is, I would think metasearch easily improves recall, but not precision. A simple voting scheme like you suggested seems unlikely to me to make precision better, and most likely makes it worse, than only using the best underlying search engine.
Is that not the case?
I mean better relevance (or better precision, depending on your term of choice).
Let’s say you have 3 engines, A, B, and C, and while they’re all pretty good, A is generally better than B, and B is generally better than C. The meta-engine gets 10 results from each and assigns them points from 10 for the top result down to 1 for the worst. Duplicates get the sum of points, so a result in Rank 3 for one engine and Rank 4 in another (8 and 7 points, respectively) would be above any singleton Rank 1 or Rank 2 result (10 or 9 points, respectively). As A is better than B, and B better than C, A’s results get an additional 0.66, and B’s results an additional 0.33. Sort the results in points order, and return the Top 10.
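Here is a minimal Python sketch of the voting scheme just described, not MetaCrawler's actual code. The names `ENGINE_BONUS` and `merge_results` are made up for illustration. It also assumes that a duplicate picks up the bonus of every engine that returned it, and that ties are broken alphabetically, neither of which the description above specifies.

```python
from collections import defaultdict

# Per-engine bonus from the description: A is best, then B, then C.
ENGINE_BONUS = {"A": 0.66, "B": 0.33, "C": 0.0}


def merge_results(results_by_engine, top_n=10, per_engine=10):
    """Merge ranked result lists by summing rank points per result.

    results_by_engine maps an engine name to its ranked list of results
    (best first). Rank 1 earns `per_engine` points, rank 2 one fewer,
    and so on; duplicates accumulate points across engines.
    """
    scores = defaultdict(float)
    for engine, results in results_by_engine.items():
        bonus = ENGINE_BONUS.get(engine, 0.0)  # unknown engines get no bonus
        for rank, result in enumerate(results[:per_engine], start=1):
            # Assumption: the bonus is added once per engine that voted.
            scores[result] += (per_engine - rank + 1) + bonus
    # Highest score first; ties broken alphabetically (an assumption).
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


if __name__ == "__main__":
    example = {
        "A": ["a1", "a2", "shared"],
        "B": ["b1", "b2", "b3", "shared"],
        "C": ["c1"],
    }
    for result, score in merge_results(example):
        print(f"{score:6.2f}  {result}")
```

In this toy input, "shared" sits at Rank 3 for A and Rank 4 for B, so it collects 8.66 + 7.33 points and lands above every singleton Rank 1 result, which is the behaviour the voting is meant to produce.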
This relatively simple algorithm will get better results than just results from one engine (and there are plenty of smarter ways to do it; my thesis has a few, but that’s very dated now). It relies on the assumption that A, B, and C are fairly close in terms of precision already, and that in general there tend to be lots of good answers. Essentially, in many cases you’re interleaving the Top 3 of 3 good engines. When there are duplicates, the voting mechanism acts as corroboration: if 2 engines vote for the same result, it’s very likely that the duplicate is the best, or a very good, answer.
Now, there is a bit of sleight of hand in the measurement here. Notice that I’m just selecting the Top 10. When A is better than B, what that usually equates to is that B’s precision @ N tends to be worse than A’s precision @ N, and as N increases, B’s precision @ N gets much worse than A’s. When N = 3, A, B, and C will be very close, meaning the top 3 results are all generally good. But as N increases, more crap comes in from B and C. Thus, the Top 30 from a meta-engine could well have worse precision than the Top 30 from the best engine, A (no guarantees, as A’s results from 11-30 may be just as crappy as B’s and C’s 4-10). But as users rarely go to Page 2, this is generally OK.
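For anyone who wants to see precision @ N in concrete terms, here is a small Python sketch. The function name `precision_at_n` and the sample data are invented for illustration; the relevance judgments are simply a set of results assumed to be good.

```python
def precision_at_n(results, relevant, n):
    """Return the fraction of the top-n results that are in `relevant`."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    hits = sum(1 for result in results[:n] if result in relevant)
    return hits / n


if __name__ == "__main__":
    # Made-up engine output: strong at the top, weaker further down.
    engine_b = ["g1", "g2", "g3", "x1", "x2", "g4", "x3", "x4", "x5", "x6"]
    good = {"g1", "g2", "g3", "g4"}
    for n in (3, 10):
        print(n, precision_at_n(engine_b, good, n))
```

With this made-up list the engine is perfect at N = 3 but only 0.4 at N = 10, which is the kind of drop-off described above for the weaker engines.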
Another big caveat is that meta-search is a very sharp double-edged sword. If A, B, and C are close, quality improves. However, if A is MUCH better than B and C... and maybe D, E, and F, then what you’re doing is injecting a bunch of crap into otherwise good results from A. This is essentially what was happening with MetaCrawler circa 2001… Google was so much better than AltaVista, Lycos, Excite, not to mention some crap engines that were there for business reasons (Kanoodle and FindWhat, two Goto/Overture clones), that people were looking for the first Google result rather than getting the benefit of metasearch. So again, as long as the underlying engines are close, but different, metasearch works really well both for recall and precision.