selberg.org

1/02/09
1:25 am
The tedium of the Search Wars

I haven’t been blogging as much lately for a variety of reasons… but the biggest is that the Search Wars seem to be, well, over. Certainly Microsoft is still very much agitating about them, and the Yahoo! death watch continues, but… there isn’t anything there that I can see. More to the point, now that I’ve been out for over a year, a few things come to mind:

Awareness

Here’s the biggest problem I see with Microsoft’s search now that I’m not working there and in the middle of things. Most people aren’t aware that they’re an option. People know about Google, they know about Yahoo. They remember MSN, but people don’t equate that with Search, and nobody who isn’t in the industry knows about Live. There’s just zero brand awareness there. And so far, nothing has changed in the past year. Now, I know Microsoft is planning on rebranding and relaunching Live (again) this spring; I suspect we’ll see a big SuperBowl splash to get the word out (hopefully better than last time, which was a lame commercial that was the 2nd in after the half-time show!). But when will they actually be in the game?

A reason to care

I’m tech savvy, and I pay attention to what happens in Web sites. But as a customer and user of search, I’m blissfully unaware of anything that would attract me to use Live. OK, they have an image on www.live.com, and they bought FareCast and are doing this CashBack thing. Um…. OK. Really, that’s it? And I know that just because I’m sorta paying attention. But it seems largely that there are really no feature additions that would attract new people and get people talking about them. And it’s not like there aren’t ideas out there. Where’s a People Search? Or how about when I type in “Review D700” I get, you know, reviews, and not pages trying to sell me the D700 that have a link saying “Review yours here!” Personalization maybe? I mean…. come on Harry and Brian, give me something to talk about!

BRIC

Everyone out there either uses Google, or doesn’t. The above spoke about converting people who use Google to switch. Now, let’s talk about the group that doesn’t use Google. Who and where are these people? Well, most of them are in countries that are rapidly coming online - Brazil, Russia, India, and China, aka BRIC. They don’t use Google because they don’t use anything. Thus, they’ll pick the best when they come online. So where’s the huge international expansion there? Google’s pushing there, and they sort of win by default by just being the dominant player. But Microsoft, which has footprints globally, doesn’t appear to be pushing here at all.

Mobile

This is a tougher one, but strikes me as another miss already. One of the strategies that Microsoft has used is to wait for a major paradigm shift and have the right product at the time, thus winning over converts who make the shift. For example, Office was the paradigm shift that enabled Word and Excel to dominate over WordPerfect and Lotus 1-2-3 - a single business package that had the major software applications people used and that worked together. What’s the next paradigm shift? Well, I don’t know if it’s a huge shift, but I’d say Mobile is the best bet for one. However, while the Live Search app on Windows Smartphones is pretty nice, it’s really just Live Maps. And it turns out Google is better on smartphones, at least in my opinion, because it loads faster and runs faster. But the bigger issue here is that nobody uses Windows Mobile… the original iPhone has outsold all Windows Mobile devices ever sold combined. Score another one for Google. Yeah, Google is competing with Apple with the Google Phone, but who cares… Google is still on the dominant player (iPhone), and RIM (BlackBerries) seems to be rapidly losing its luster, even if Microsoft buys them. So…. is there anything there?

OK… so there are my four reasons why I don’t see much excitement in Search. I wish there was something there. Certainly, a product space is interesting when there’s good competition and good features. Remember the mid-90s? There were a bunch of search engines, and people I knew DID compare one to the other. Certainly there was a lot of, “Oh, I use Lycos / AltaVista / Excite / etc. because it’s better” type comments, but all these guys were still offering new features - like image search when it came out, or stock quotes, or whatnot. Things that made people talk, and want to try out a different engine. That’s what’s missing…. there’s no talk, no reason to try something new. People are becoming set with their tool - and until something changes in a major way, the war is done.

1/02/09
12:20 am
2009 Predictions

OK… since I demonstrated how uncannily uh, inaccurate I am, here are MORE predictions for 2009. Remember, I’m under 50%, so best bet is to bet against me!

Politics

  1. Senator Al Franken
  2. Ms. Caroline Kennedy
  3. Universal Health Care Plan introduced

Search

  1. Microsoft rebrands / relaunches Search. Again.
  2. Microsoft query share remains within 3% of what it is today.
  3. Microsoft buys RIM, pushes Live Search as default Web search for RIM
  4. Google stock battered, as economy hits advertising hard (hits others worse)

Sports

  1. No movement on Seattle basketball team
  2. Mariners, Seahawks, Huskies all continue to suck
  3. Steelers head to Miami!

OK… I’ll stop at 10 predictions. We’ll see how things are in a year. Happy New Year!

1/02/09
12:00 am
2008 Predictions Reviewed

Well, let’s see how I did from my predictions for the year:

Hey everyone,

Been a restful and quiet holiday season here at the ranch, so haven’t been posting. I know I know. At any rate, I thought I’d state some bold, crazy, and just downright mind-bottling predictions.

Politics

  • Huckabee beats out McCain for the Republican nomination.
  • Clinton defeats Obama and Edwards.
  • Clinton then defeats Huckabee in the General Election, making Dennis Miller’s predictions of Bush-Clinton-Bush-Clinton a reality. In an ironic twist, people desire a “Bridge to the Past” — in particular, the world prior to the Bush Administration. Booming economy, no war, good times.

Search

  • Google picks up another 5% share. Microsoft starts to spend serious money, still doesn’t buy Yahoo.
  • Baidu solidifies in China in a big way.
  • Emerging Markets become competitive (South America, India, Russia, etc.) towards the end of the year.
  • IAC (Ask.com) merges with Yahoo.
  • Personalization (personalized search and personalized ads, in particular) become differentiating features.
  • FaceBook Web Search appears (powered by Microsoft). Google pushes their own FB search app heavily.

Football

  • New England gets taken down.

Basketball

  • Clay Bennett keeps trying to move the Sonics, but can’t get out of the lease until 2010. Both sides dig in. Resolution in 2009.

Personal Hopes

  • Randy Pausch sees 2009.

Right… so I went 4 and 7. Who listens to me anyway? :)

One that I’m saying I missed…. for everything that’s been said about personalization, it still really hasn’t become a differentiating feature…. I mean, do YOU use personalization, anywhere? Does it feel that something is being personalized, and it’s a good thing? If it’s there, it’s an invisible feature, which means it’s very hard to care about it. Ah well.

Now on for 2009!

12/04/08
2:00 pm
Interview at FederatedSearchBlog

Sol Lederman interviewed me as one of the MetaSearch Luminaries on the Federated Search Blog. Brings back some memories!

He’s done it in a couple of parts, so I’ll update them here:

Questions
Part 1

9/23/08
7:15 pm
Jan Pedersen is now at A9

Come on ValleyWag, it’s been two whole days!

(for those keeping score, he joins Daniel Rose. Both were most recently at Yahoo!)

7/25/08
1:25 pm
RIP Randy Pausch

Covered just about everywhere, Randy Pausch has died at 47 from pancreatic cancer. I predicted this year that he’d make it to 2009 on pure hope and faith in his spirit, but sadly, no.

I won’t say much on this, as a number of people far more eloquent than I are. However, I took this screen shot of CNN, which features Randy while giving his “Last Lecture.” It says something quite profound about a person when their death makes the front page nationally.

Rest in peace.

cnn-pausch.JPG

5/25/08
11:20 pm
Google’s trust-building

When I interviewed at Microsoft many years ago, one of my interviewers asked me what I thought the next big thing for search was. I said: “trust.” Right now, people get pages back, but there’s still a huge degree of distrust of what they see. People trust Amazon.com, and (for better or worse) seem to trust Wikipedia. But random sites? Hmm. Some people are generally trusting, but many aren’t, and the continuous stories of identity theft and credit card theft make people more paranoid (which is probably a good thing).

I still stand by my statement. Of the “next big things” for search people keep talking about, such as blended search, personalization, social search, etc., I still believe that trust will be the big differentiator. There is a lot of crap out there, and I suspect it’s growing a lot faster than quality pages.

Which brings me to the following. The other day, I received the following in my inbox:

Dear site owner or webmaster of selberg.org,

While we were indexing your webpages, we detected that some of your pages were using techniques that are outside our quality guidelines, which can be found here: http://www.google.com/webmasters/guidelines.html. This appears to be because your site has been modified by a third party. Typically, the offending party gains access to an insecure directory that has open permissions. Many times, they will upload files or modify existing ones, which then show up as spam in our index.

The following is some example hidden text we found at http://selberg.org/2008/02/:

buy viagra
buy viagra online
viagra online
discount viagra
order viagra
cheap viagra
generic viagra
generica viagra
viagra buy
viagra price
order viagra online
viagra generic
viagra pill
where buy viagra
buy viagra cheap
viagra order
get viagra
buy online viagra
online viagra
viagra sale online
where to buy viagra
cheapest viagra
purchase viagra
cheap viagra online
viagra buy online
buying viagra
buy viagra on
generic viagra canada
prescription viagra
buy viagra norway
generic viagra pack

[...]

In order to preserve the quality of our search engine, we have temporarily removed some of your webpages from our search results. Currently pages from selberg.org are scheduled to be removed for at least 30 days.

We would prefer to have your pages in Google’s index. If you wish to be reconsidered, please correct or remove all pages (may not be limited to the examples provided) that are outside our quality guidelines. One potential remedy is to contact your web host technical support for assistance. For more information about security for webmasters, see http://googlewebmastercentral.blogspot.com/2007/09/quick-security-checklist-for-webmasters.html.

When you are ready, please visit https://www.google.com/webmasters/tools/reinclusion?hl=en to learn more and submit your site for reconsideration.

Sincerely,
Google Search Quality Team

My first reaction was, WTF? I run my own blog, and I know I’m not spamming. Somebody phishing me? Nope… links are legit… so I go to the page in question, and sure enough on my “My advisor’s WSDM” post, there was a hidden block with a number of links:

<font style="position: absolute;overflow: hidden;height: 0;width: 0">
<a href="http://www.bigbadbookblog.com/?menu=1" title="buy viagra">buy viagra</a><br />
<a href="http://www.bigbadbookblog.com/?menu=2" title="buy viagra online">buy viagra online</a><br />
<a href="http://www.bigbadbookblog.com/?menu=3" title="viagra online">viagra online</a><br />
...

And since it was on that post, it was also on the archive post for Feb, which was the link Google found.
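
For anyone who wants to check their own pages for this kind of injection, here is a rough sketch of a scanner. It is an illustration only, not how Google detects hidden text: it uses Python's standard `html.parser`, and the style patterns, the `HiddenLinkFinder` class, and the `find_hidden_links` helper are all hypothetical names and heuristics chosen for this example. It flags links nested inside any element whose inline style collapses it to zero size or hides it outright, which is the trick used in the block above.

```python
import re
from html.parser import HTMLParser

# Inline-style patterns that hide content. An assumed, non-exhaustive heuristic.
HIDDEN_STYLE = re.compile(
    r"(height|width)\s*:\s*0(px)?\s*(;|$)|display\s*:\s*none|visibility\s*:\s*hidden",
    re.IGNORECASE,
)

# Elements that never get a closing tag, so they must not touch the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class HiddenLinkFinder(HTMLParser):
    """Collects hrefs of links nested inside elements hidden via inline style."""

    def __init__(self):
        super().__init__()
        self.stack = []         # one bool per open element: hidden by its own style?
        self.hidden_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # A link counts as hidden if any currently open ancestor is hidden.
        if tag == "a" and any(self.stack) and attrs.get("href"):
            self.hidden_links.append(attrs["href"])
        if tag not in VOID_TAGS:
            self.stack.append(bool(HIDDEN_STYLE.search(attrs.get("style") or "")))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()


def find_hidden_links(html):
    finder = HiddenLinkFinder()
    finder.feed(html)
    return finder.hidden_links


sample = """<font style="position: absolute;overflow: hidden;height: 0;width: 0">
<a href="http://www.bigbadbookblog.com/?menu=1" title="buy viagra">buy viagra</a><br />
</font>
<p><a href="http://selberg.org/">a normal, visible link</a></p>"""

print(find_hidden_links(sample))
```

This only looks at inline `style` attributes; anything hidden through an external stylesheet or script would slip past it, so treat a clean result as weak evidence at best.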

I went into panic mode, and first edited the post, then went to Google to ensure my blog wasn’t removed. I was led through an interesting process, where I ended up registering my site, then registering myself as owner (by putting in a special META tag and having Google confirm it was there), and then acknowledging that I’m behaving and all is well.

This is a great process, in that they now have a known owner for a site they can contact. And they know the site is active and somebody on the other end cares… whether they’re a spammer or not, to be determined later. It also means that Google can now alert me to problems with my site… such as poor indexing, or if my site or a post gets hijacked again (still not sure how it happened, but I’ve updated WordPress at least!). Google is doing a ton of analysis and has published some of what they’re doing, such as massive map-reduce scans looking for malware landing pages, in a technical report, “All Your iFrames Point to Us.” They’re also highlighting sites that they believe may be harmful to your computer (but aren’t sure of enough to remove from their index).

Upon reflection, here’s what they’re doing, and here’s what I now believe:

  • Google is building up a network of sites and site owners, getting to know them better;
  • Google is creating a framework to help registered site owners ensure their sites are legit;
  • Google is actively trying to identify and remove bad and malicious content from their index;
  • Google is being (surprisingly) public about what they’re doing.

What this means to me is that the sites that appear on Google are likely more trustworthy than they are on competitors, such as Microsoft. Now, I know Microsoft has a security group, and they’re doing a lot to go after malware and phishing (for example, the recent anti-phishing plugin on IE is a great step in that direction). But are they connecting all the dots? And are they doing so publicly? Because frankly, without telling people what you’re doing for them, they’re very unlikely to give you proper credit for what you’ve done.

5/20/08
11:10 pm
Live Search Cashback is up

Check it out now… http://search.live.com/cashback

Lots of coverage out there; I saw Todd report it first.

I’ll post some more tomorrow when I can play with it a bit more. Initially, it looks like Live Search Cashback is basically eBates on steroids. They’ve got a number of merchants signed up, and they’re doing both comparison shopping as well as splitting the commission on sale with the purchaser, thus giving a good reason to do comparison shopping @ Live Search. However, unlike Amazon’s Pro Merchant program, which I highly recommend, the customer goes directly to the shopping store. Amazon actually facilitates the transaction, thus the merchant in question doesn’t actually see the customer’s billing information and the merchant must abide by Amazon’s A-Z guarantee, esp. on returns. This is nice when dealing with notoriously sketchy merchants, such as digital camera dealers in New York that aren’t B&H Photo Video; it forces them to behave (and they are on Amazon!).

5/04/08
1:36 am
And so it ends

It looks like Microsoft’s attempted acquisition of Yahoo! has come to an end. Apparently, $46 billion wasn’t good enough, but $50 billion would have been. So what’s $4 billion between friends? Ah well. Mini already has a post up having popped a cork, and I’m sure MSFTExtremeMakeover will have something shortly. And I’m sure there will be plenty of analysis posts as to why, why it’s a good thing, why it’s a bad thing, what might have been, and so on.

So here’s mine!

First, so if I understand properly, Microsoft bid $41 billion, Yahoo! wanted $50 billion. So Microsoft came up $5B more, met ‘em halfway… Yahoo! still wanted the full $50B. OK… so if you can come up with $5B, why not $10B? And yeah, I understand, these are scarily huge numbers. But hey, if you’re going to sit down at the World Cup of Poker, you know it’s not a $10 buy-in. I actually wonder if it’s too much of a bet-the-company move… e.g. Microsoft can currently afford anyone that’s $46B or less, but more… not so much.

Second… so what’s next? Well, let’s see….

Option 1: Keep at it! Keep at it! Keep at it!

Well, Satya, Brian, Harry, and the gang have to do something. And now they won’t have the distraction of integrating Yahoo!. Plus, this means that most of Microsoft will now align very closely with services, focusing on ads and search. A search bar in every application, every desktop, every skin. And renewed focus on new frontiers, such as XBox and mobile - especially XBox.

Option 2: Buy! Buy! Buy!

Buy someone else! Or elses! But who? Well, how’s this little gem from comScore:

Baidu Ranked Third Largest Worldwide Search Property by comScore in December 2007


To aid in your research and coverage of Baidu’s recent announcement to enter the Japan market with www.baidu.jp, relevant comScore qSearch worldwide data are provided below.

In December 2007, 66.2 billion search queries were conducted worldwide.

In December 2007, Baidu.com Inc. was the third ranked search property worldwide with 3.4 billion searches, capturing 5.2 percent of worldwide search share.

Worldwide Search Top 10
December 2007
Total World Age 15+, Home and Work Locations*
Source: comScore qSearch 2.0

                           Searches (MM)   Share of Searches (%)
Total Internet                    66,221                   100.0
Google Sites                      41,345                    62.4
Yahoo! Sites                       8,505                    12.8
Baidu.com Inc.                     3,428                     5.2
Microsoft Sites                    1,940                     2.9
NHN Corporation                    1,572                     2.4
eBay                               1,428                     2.2
Time Warner Network                1,062                     1.6
Ask Network                          728                     1.1
Yandex                               566                     0.9
Alibaba.com Corporation              531                     0.8
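
As a quick sanity check on the comScore figures, the share column is just each property's searches divided by the 66,221 million total. The sketch below recomputes it; the search counts are copied from the table above, while the `share_pct` helper and the combined Microsoft plus Yahoo! figure are additions for illustration, not part of the comScore release.

```python
# Searches in millions, December 2007, copied from the comScore table above.
TOTAL_MM = 66221
searches_mm = {
    "Google Sites": 41345,
    "Yahoo! Sites": 8505,
    "Baidu.com Inc.": 3428,
    "Microsoft Sites": 1940,
    "NHN Corporation": 1572,
    "eBay": 1428,
    "Time Warner Network": 1062,
    "Ask Network": 728,
    "Yandex": 566,
    "Alibaba.com Corporation": 531,
}


def share_pct(searches, total=TOTAL_MM):
    """Share of worldwide searches, as a percentage rounded to one decimal."""
    return round(100.0 * searches / total, 1)


shares = {name: share_pct(count) for name, count in searches_mm.items()}
for name, pct in shares.items():
    print(f"{name:<26}{pct:>6.1f}")

# Hypothetical combination, added here only to size up the failed deal.
combined = share_pct(searches_mm["Microsoft Sites"] + searches_mm["Yahoo! Sites"])
print(f"{'Microsoft + Yahoo!':<26}{combined:>6.1f}")
```

The recomputed shares match the published column, and they make the gap plain: even a combined Microsoft and Yahoo! would have sat at roughly a quarter of Google's volume.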

Baidu is the dominant engine in China; NHN is www.naver.com, which is the dominant engine in South Korea. Oh, and today, 5/4/2008, NHN is worth about $11.25B (current price, in KRW), and Baidu is worth $12.36B (current price in USD).

Naver hasn’t shown any propensity to move outside of Korea, and for the most part their stranglehold on South Korea is their huge question and answers site (which is what Yahoo! Answers, Microsoft QnA, and Baidu’s iKnow are based upon). Their search, last I knew, wasn’t terribly great.

But Baidu…. Baidu is doing real search. Baidu just launched in Japan earlier this month. And they have the currently dominant question and answer site, although TenCent, which runs QQ, the dominant instant messenger in China by far, is looking to create their own version that may cause some trouble. And Baidu has got heavy competition from Google.

Now, there are certainly issues with buying Baidu due to the Chinese government. But… well… at the end of the day, those Yahoo customers aren’t going anywhere quickly - not to Google, not to MSN. That’s one of the key reasons why, IMHO, Microsoft wanted to buy them. But that isn’t happening, so those customers stay with Yahoo. Now, Microsoft still needs to get some additional customers somehow, somewhere. If not from Yahoo, and if not from Google… well, for me, I’d start looking abroad really quickly myself.

4/24/08
11:04 pm
William Chang, CTO Baidu, on Search

Dr. William Chang gave a number of talks at WWW2008. A brief history: PhD from Berkeley, one of the main developers of InfoSeek, long time advisor and lately (Jan 2007) CTO of Baidu. Here’s some distillation of them, and as always just my interpretations of what he’s said:

Working in China
He spoke at length about the issues of having a company in China; in particular, the perils of having an established non-Chinese company try to move into the Chinese market. He gave a number of examples of companies that haven’t done well, such as Google and Yahoo (Baidu being dominant in search), eBay (being overcome by TaoBao), and the foreign instant messengers (being overcome by Tencent, which runs QQ, the dominant instant messenger in China). I need to ask him how he views Joyo.com (Amazon.com.cn!). Anyway, various keys he mentioned to success in China (there were a few others, but these were my main takeaways):

  • Be focused on China. Other companies were moving in as an after-thought and focused more on their primary markets (typically the US).
  • Sell to the market you have, not the one you want. OK, pardon the pun, but essentially, non-Chinese companies often assumed China would be like the US and Europe, and have tried to market their services as such.
  • Managers should be local and domain experts. In other markets, the US for example, managers of a division would typically be domain experts of the business as well as local and able to speak the language. Often, foreign companies would send either a domain expert that couldn’t speak Chinese, or someone that was local but not a domain expert. Neither is a good choice.
  • Ground troops are plentiful. Use them. It’s cheap and easy to get big fast, so ground troops are necessary. Apparently Baidu has well upwards of 3,000 sales and another 3,000 indirect sales people - well in excess of the ~1,000 developers. Most are graduates from local colleges - apparently nearly all hiring is straight from college. Yes, this requires different types of management; Chang described a poll-like model where people are rated on 10 categories, and this allows management to improve both people as well as groups along these axes. Microsoft does something a bit similar with the MS Poll, but that’s more for identifying problems with managers and groups.

Overall, the first and last were the two key points I took away. If you assume China is a different market, which having been here a few times now I can certainly accept, then it makes absolute sense to have a local division be completely empowered to focus on the local market and get lots of ground troops to do it. From prior experience, if the organization has a central (say US) based development effort and some small feature is needed for a remote site (say China), it can be frustratingly difficult for the US based group to do it given other priorities. Yahoo solved the problem to a degree in search by having a group dedicated to handling non-US requests that the core search team wouldn’t do. But really, when you can find ground troops locally, get them.

Three Generations of Search
Chang also spoke at length about the Three Generations of Search. I’ll cut to the chase:

  • First Generation (1993 - 2001): IR at scale
    • Phrases
    • Source rating / prior
    • Multiple facets vs single relationships
  • Second Generation (2001 - present): Data and heterogeneity
    • “Web Oracle” model - the web will know the answer
    • User-generated content
    • Tagging
    • Wikipedia / Baidupedia
    • Question Answering (Naver in Korea, Yahoo Answers in US, Sina’s iAsk and Baidu’s iKnow in China)
    • People Search (LinkedIn and FaceBook)
  • Third Generation (??): Internet as a Matching Network
    • Personalization
    • Integration of Search and Recommendation
    • Predictive recommendation with feedback

For the most part, the culmination of the First Generation would be sites like AltaVista or InfoSeek. Second Generation brings us the current offerings from Google, Yahoo, and Microsoft. The Third? Well, he mentioned that Amazon was doing a number of useful things there ;) but nobody else had been successful yet. Which raises the question: will the existing second generation engines turn into the third?

More seriously, the main focus of the Third Generation is using massive data and computation per individual rather than in aggregate. Right now, Google, Microsoft, and Yahoo are able to make various generalizations about things using the prior history of their millions of customers and sessions. However, the question is how to make use of an individual’s history and related people’s history to make meaningful personal recommendations. As an example, one of the papers I was most disappointed with was Spatial Variation in Search Engine Queries by Backstrom, Kleinberg, Kumar, and Novak (Cornell and Yahoo). It showed that queries emanate disproportionately from different regions; for example, baseball team names are very focused around their local region. Yeah, OK. And then? For example, do people click on different results on the same query issued from different locales? Or is there implicit, or explicit, meaning on terms in different locales? It was mentioned that “Cardinals” had two hotspots - Arizona (the football team) and St. Louis (the baseball team)… so presumably people want and will click on different things. But in what way? What are the metrics? And as you get closer to the midpoint of Arizona and Missouri, how do you merge in the results?
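
To make the “Cardinals” question concrete, here is one deliberately naive way to merge: weight each interpretation by the inverse of the searcher's distance to its hotspot, then split the result slots in proportion. This is not the method from the paper or from Chang's talk; it is a sketch under assumptions. The hotspot coordinates are approximate city locations for Phoenix and St. Louis, and `intent_weights`, `allocate_slots`, and the `eps_km` smoothing term are hypothetical names made up for this example.

```python
import math

# Approximate (lat, lon) of the two hotspots; assumed values for illustration.
HOTSPOTS = {
    "football (Arizona)": (33.45, -112.07),    # Phoenix
    "baseball (St. Louis)": (38.63, -90.20),   # St. Louis
}


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def intent_weights(user, hotspots=HOTSPOTS, eps_km=1.0):
    """Inverse-distance weights over interpretations; the weights sum to 1."""
    inv = {name: 1.0 / (haversine_km(user, loc) + eps_km)
           for name, loc in hotspots.items()}
    total = sum(inv.values())
    return {name: value / total for name, value in inv.items()}


def allocate_slots(weights, k=10):
    """Split k result slots by weight (largest remainder, so counts sum to k)."""
    raw = {name: w * k for name, w in weights.items()}
    slots = {name: int(r) for name, r in raw.items()}
    leftover = k - sum(slots.values())
    by_remainder = sorted(raw, key=lambda n: raw[n] - slots[n], reverse=True)
    for name in by_remainder[:leftover]:
        slots[name] += 1
    return slots


for label, user in [("Phoenix", (33.45, -112.07)),
                    ("roughly halfway", (36.0, -101.1)),
                    ("St. Louis", (38.63, -90.20))]:
    print(label, allocate_slots(intent_weights(user)))
```

Distance alone is obviously the wrong signal by itself (a Cardinals fan in Phoenix may well follow baseball), which is exactly why click data per locale would be the interesting metric.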

I’m not sure if it’s an extension of the Second Generation or the Third, but I’d say solving those types of issues is clearly part of the next wave of things. And it’s good to see that Baidu is also pushing on that, in addition to the Big Three (and in fairness, Baidu is #3 world-wide… yup… and Naver #4. No wonder Microsoft wants to buy Yahoo! ;) )