selberg.org

5/04/08
1:36 am
And so it ends

It looks like Microsoft’s attempted acquisition of Yahoo! has come to an end. Apparently, $46 billion wasn’t good enough, but $50 billion would have been. So what’s $4 billion between friends? Ah well. Mini already has a post up having popped a cork, and I’m sure MSFTExtremeMakeover will have something shortly. And I’m sure there will be plenty of analysis posts as to why, why it’s a good thing, why it’s a bad thing, what might have been, and so on.

So here’s mine!

First, so if I understand properly, Microsoft bid $41 billion, Yahoo! wanted $50 billion. So Microsoft came up $5B more, met ‘em halfway… Yahoo! still wanted the full $50B. OK… so if you can come up with $5B, why not $10B? And yeah, I understand, these are scarily huge numbers. But hey, if you’re going to sit down at the World Cup of Poker, you know it’s not a $10 buy-in. I actually wonder if it’s too much of a bet-the-company move… e.g. Microsoft can currently afford anyone that’s $46B or less, but more… not so much.

Second… so what’s next? Well, let’s see….

Option 1: Keep at it! Keep at it! Keep at it!

Well, Satya, Brian, Harry, and the gang have to do something. And now they won’t have the distraction of integrating Yahoo!. Plus, this means that most of Microsoft will now align very closely with services, focusing on ads and search. A search bar in every application, every desktop, every skin. And renewed focus on new frontiers, such as XBox and mobile - especially XBox.

Option 2: Buy! Buy! Buy!

Buy someone else! Or elses! But who? Well, how’s this little gem from comScore:

Baidu Ranked Third Largest Worldwide Search Property by comScore in December 2007


To aid in your research and coverage of Baidu’s recent announcement to enter the Japan market with www.baidu.jp, relevant comScore qSearch worldwide data are provided below.

In December 2007, 66.2 billion search queries were conducted worldwide.

In December 2007, Baidu.com Inc. was the third ranked search property worldwide with 3.4 billion searches, capturing 5.2 percent of worldwide search share.

Worldwide Search Top 10
December 2007
Total World Age 15+, Home and Work Locations*
Source: comScore qSearch 2.0

Property                    Searches (MM)   Share of Searches (%)
Total Internet                     66,221                   100.0
Google Sites                       41,345                    62.4
Yahoo! Sites                        8,505                    12.8
Baidu.com Inc.                      3,428                     5.2
Microsoft Sites                     1,940                     2.9
NHN Corporation                     1,572                     2.4
eBay                                1,428                     2.2
Time Warner Network                 1,062                     1.6
Ask Network                           728                     1.1
Yandex                                566                     0.9
Alibaba.com Corporation               531                     0.8
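A quick sanity check on those numbers, as a minimal Python sketch: it recomputes each share from the search counts in the comScore table above, then adds up a few hypothetical pairings. The pairing sums are my own naive arithmetic (they assume every search carries over after a deal, which it wouldn't), and the names `searches_mm`, `share_pct`, and `combined_share` are just made up for the example.

```python
# Search counts in millions, copied from the comScore table above.
TOTAL_MM = 66_221
searches_mm = {
    "Google Sites": 41_345,
    "Yahoo! Sites": 8_505,
    "Baidu.com Inc.": 3_428,
    "Microsoft Sites": 1_940,
    "NHN Corporation": 1_572,
    "eBay": 1_428,
    "Time Warner Network": 1_062,
    "Ask Network": 728,
    "Yandex": 566,
    "Alibaba.com Corporation": 531,
}


def share_pct(count_mm, total_mm=TOTAL_MM):
    """Share of worldwide searches, in percent, rounded to one decimal."""
    return round(100.0 * count_mm / total_mm, 1)


def combined_share(*names):
    """Naive combined share: simply adds the search counts of the named properties."""
    return share_pct(sum(searches_mm[name] for name in names))


shares = {name: share_pct(count) for name, count in searches_mm.items()}

for name, pct in shares.items():
    print(f"{name:<26}{pct:>6.1f}")

print("Microsoft + Yahoo!:", combined_share("Microsoft Sites", "Yahoo! Sites"))
print("Microsoft + Baidu: ", combined_share("Microsoft Sites", "Baidu.com Inc."))
print("Microsoft + NHN:   ", combined_share("Microsoft Sites", "NHN Corporation"))
```

The recomputed shares match the published column, which is at least reassuring about the table.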

Baidu is the dominant engine in China, NHN is www.naver.com, which is the dominant engine in South Korea. Oh, and today, 5/4/2008, NHN is worth about $11.25B (current price, in KRW), and Baidu is worth $12.36B (current price in USD).

Naver hasn’t shown any propensity to move outside of Korea, and for the most part their stranglehold on South Korea comes from their huge question-and-answer site (which is what Yahoo! Answers, Microsoft QnA, and Baidu’s iKnow are based upon). Their search, last I knew, wasn’t terribly great.

But Baidu…. Baidu is doing real search. Baidu just launched in Japan earlier this month. And they have the currently dominant question and answer site, although TenCent, which runs QQ, the dominant instant messenger in China by far, is looking to create their own version that may cause some trouble. And Baidu has got heavy competition from Google.

Now, there are certainly issues with buying Baidu due to the Chinese government. But… well… at the end of the day, those Yahoo customers aren’t going anywhere quickly - not to Google, not to MSN. That’s one of the key reasons why, IMHO, Microsoft wanted to buy them. But that isn’t happening, so those customers stay with Yahoo. Now, Microsoft still needs to get some additional customers somehow, somewhere. If not from Yahoo, and if not from Google… well, for me, I’d start looking abroad really quickly myself.

4/24/08
11:04 pm
William Chang, CTO Baidu, on Search

Dr. William Chang gave a number of talks at WWW2008. A brief history: PhD from Berkeley, one of the main developers of InfoSeek, long time advisor and lately (Jan 2007) CTO of Baidu. Here’s some distillation of them, and as always just my interpretations of what he’s said:

Working in China
He spoke at length about the issues of having a company in China; in particular, the perils of having an established non-Chinese company try to move into the Chinese market. He gave a number of examples of markets where foreign companies haven’t done well, such as search (Google and Yahoo, with Baidu being dominant), auctions (eBay, overcome by TaoBao), and instant messaging (where Tencent’s QQ is dominant). I need to ask him how he views Joyo.com (Amazon.com.cn!). Anyway, various keys he mentioned to success in China (there were a few others, but these were my main takeaways):

  • Be focused on China. Other companies were moving in as an after-thought and focused more on their primary markets (typically the US).
  • Sell to the market you have, not the one you want. OK, pardon the pun, but essentially, non-Chinese companies often assumed China would be like the US and Europe, and have tried to market their services as such.
  • Managers should be local and domain experts. In other markets, the US for example, managers of a division would typically be domain experts of the business as well as local and able to speak the language. Often, foreign companies would send either a domain expert that couldn’t speak Chinese, or someone that was local but not a domain expert. Neither is a good choice.
  • Ground troops are plentiful. Use them. It’s cheap and easy to get big fast, so ground troops are necessary. Apparently Baidu has well upwards of 3,000 sales and another 3,000 indirect sales people - well in excess of its roughly 1,000 developers. Most are graduates from local colleges - apparently nearly all hiring is straight from college. Yes, this requires different types of management; Chang described a poll-like model where people are rated on 10 categories, and this allows management to improve both people as well as groups along these axes. Microsoft does something a bit similar with the MS Poll, but that’s more for identifying problems with managers and groups.

Overall, the first and last were the two key points I took away. If you assume China is a different market, which having been here a few times now I can certainly accept, then it makes absolute sense to have a local division be completely empowered to focus on the local market and get lots of ground troops to do it. From prior experience, if the organization has a central (say US) based development effort and some small feature is needed for a remote site (say China), it can be frustratingly difficult for the US based group to do it given other priorities. Yahoo solved the problem to a degree in search by having a group dedicated to handling non-US requests that the core search team wouldn’t do. But really, when you can find ground troops locally, get them.

Three Generations of Search
Chang also spoke at length about the Three Generations of Search. I’ll cut to the chase:

  • First Generation (1993 - 2001): IR at scale
    • Phrases
    • Source rating / prior
    • Multiple facets vs single relationships
  • Second Generation (2001 - present): Data and heterogeneity
    • “Web Oracle” model - the web will know the answer
    • User-generated content
    • Tagging
    • Wikipedia / Baidupedia
    • Question Answering (Naver in Korea, Yahoo Answers in US, Sina’s iAsk and Baidu’s iKnow in China)
    • People Search (LinkedIn and FaceBook)
  • Third Generation (??): Internet as a Matching Network
    • Personalization
    • Integration of Search and Recommendation
    • Predictive recommendation with feedback

    For the most part, the culmination of the First Generation would be sites like AltaVista or InfoSeek. Second Generation brings us the current offerings from Google, Yahoo, and Microsoft. The Third? Well, he mentioned that Amazon was doing a number of useful things there ;) but nobody else had been successful yet. Which raises the question: will the existing second generation engines turn into the third?

    More seriously, the main focus of the Third Generation is using massive data and computation per individual rather than per aggregate. Right now, Google, Microsoft, and Yahoo are able to make various generalizations about things using the prior history of their millions of customers and sessions. However, the question is how to make use of an individual’s history and related people’s history to make meaningful personal recommendations. As an example, one of the papers I was most disappointed with was Spatial Variation in Search Engine Queries by Backstrom, Kleinberg, Kumar, and Novak (Cornell and Yahoo). It showed that queries emanate disproportionately from different regions; for example, baseball team names are very focused around their local region. Yeah, OK. And then? For example, do people click on different results for the same query issued from different locales? Or is there implicit, or explicit, meaning in terms in different locales? It was mentioned that “Cardinals” had two hotspots - Arizona (the football team) and St. Louis (the baseball team)… so presumably people want and will click on different things. But in what way? What are the metrics? And as you get closer to the midpoint of Arizona and Missouri, how do you merge in the results?

    I’m not sure if it’s an extension of the Second Generation or the Third, but I’d say solving those types of issues is clearly part of the next wave of things. And it’s good to see that Baidu is also pushing on that, in addition to the Big Three (and in fairness, Baidu is #3 world-wide… yup… and Naver #4. No wonder Microsoft wants to buy Yahoo! ;) )

    4/23/08
    10:50 pm
    Themes from Beijing

    I’m attending WWW2008 in Beijing this week. It’s turned into a bit of a monster conference… nine simultaneous tracks over three days, not to mention a day of workshops and tutorials! Yow! And I’m seeing a number of colleagues from the usual haunts here as well. Both Kai-Fu Lee, head of Google China, and Harry Shum, head of Microsoft’s Live Search development, gave keynotes, and I thought their themes were quite interesting and contrasting.

    Kai-Fu Lee’s theme was Cloud Computing, or moving to a world where data and computation are handled on remote anonymous servers and applications run on top of them. He gave an overview of a number of Google applications that run on this - Search, Mail, etc. I was struck by one comment he made, which is that cloud computing frees people from the monopoly of a single company controlling everything. Except, of course, the company that runs everything in the cloud for you…. Meet the New Boss…. Same as the Old Boss! But digs at Microsoft aside, the path outlined was clearly focused on Web applications built out on cloud computing, with those applications all leveraging large scale, reliability, and naturally massive amounts of data to handle things.

    Harry’s talk was more of a Company Meeting talk, in which he handed the microphone to Graham Sheldon to show off some demos, in particular highlighting some of the cool things MSRA is doing as well as some of the latest on the Live Search release. They led off with what I thought was the best, which is some work from MSRA’s speech group that extracts speech from video and then enables you to see related videos while watching them. It was put together well, so it isn’t so much a “watch while on the Web” demo but “imagine you’re watching TV” video. I’ll see if I can’t find a link, but good stuff. Also shown was Guanxi, which tries to do a people / relationship search… in this case, it showed who was related to Bill Gates. They also showed a demo where you could do query-by-image, which would show images related to a target image. I need to ask some of my former UW colleagues who did things like QBIC (Query By Image Content). The demos of released Live Search features were focused on new features in the News and Local Verticals, including some cool stuff from the Maps team (which continuously produces some great stuff). Oh, and they have a few things on health they’re experimenting with, and trying to get things hooked up with the HealthVault.

    OK… so we have two “My company is doing cool stuff, come work for us!” keynotes. But do we have any insight here?

    Yes. Google, as widely reported often and everywhere, is busy making an operating system platform of cloud computing that they then build their services on. They’re not actually selling or providing a cloud - Amazon is, with EC2 and S3. But they’re creating the applications that depend on the cloud.

    Microsoft, on the other hand, isn’t really pushing the cloud platform. They have a number of components for that, but the demos shown are all slices on search. But they’re certainly not talking about the power of their platform; they’re talking about cool features. But I worry along that line. The problem they have, which they and Google are trying to address, is user flow. Users don’t go to a vertical, they go to search. So now the problem is to discover intent on when it’s appropriate to show essentially a house ad for a vertical with some content, and then create a compelling, and consistent, experience as a user moves from “search” into “news” or “health exploration” or whatever they’re doing.

    What I can’t help but wonder is why neither appears to be really pursuing differentiated domains and brands. For example, I still don’t think of Google, Yahoo, or Microsoft when I think “news.” I think CNN. And really, I don’t think “news search” so much; I want more of a newspaper. Archival search is great, but should be from within the news portal. To that degree, I wonder why “Live News” isn’t more MSNBC, or even just a different URL, such as www.livenews.com (it’s some random news site… probably buyable!). Certainly there’s lots of direct visitation to www.youtube.com, and I’m still more familiar with www.mapquest.com than the URLs for Google, Yahoo, or Live maps.

    Anyway, food for thought… as always, I’ll lie about updating this later as the conference progresses.

    Update 4/25: We (a number of anonymous conference delegates, and yours truly) now have short synopses on all the keynotes. In order:

    • Kai-Fu Lee, Google: Use our stuff!
    • Harry Shum, Microsoft: We have stuff!
    • Sir Tim Berners-Lee, W3C: I invented stuff!
    • Robin Li, Baidu: I paid for this stuff!
    • David Belanger, AT&T Labs: We route stuff!

    In fairness, we’re sort of making up Robin Li’s synopsis. Sir Tim’s keynote was somewhat, uh, long and rambly, and after about 30 minutes of it the audience in the Great Hall of the People got restless and started heading to the drink counters for more beer and wine. Sadly, by the time Robin got to the stage, the audience was in no mood to listen and was already engaged in conversation, so we’re not really sure what he said. But Baidu did sponsor the banquet, which rocked, so we thanked him for that.

    David Belanger’s keynote was the best in my opinion… and not just because he didn’t do either a passive-aggressive product placement speech or an aggressive-aggressive product demo speech. He just talked about content, experience and devices, and networking to them and a lot of the challenges. For example, apparently as of 10 years ago when AT&T licensed out its rotary phone service, that was still upwards of a BILLION dollar business. For rotary phones. When a new touch-tone costs $10, or is often free. The main takeaways were that (a) there are loads of devices and endpoints, and it’s all increasing, and (b) the observation and re-iteration that old devices go away slowly. The last is ignored at people’s peril… people hold on to things a lot longer than nearly everyone else would like.

    4/21/08
    7:05 pm
    Microsoft acquires FareCast
    So in a surprising move (well, to me at least… ) Microsoft purchased FareCast, the latest startup from my advisor, Oren Etzioni, for a paltry (heh) $115 million. FareCast is a great concept… simply take historical price data from the airlines, and predict whether the price will go up or down. I first wrote about them in 2006, when I said:

    Personally, I give ‘em less than a year before they’re bought by Orbitz, Expedia, or Travelocity. I’m not sure if I’d bank on the prediction model of whether I should buy a ticket now or not, given that if I wait, I might not be able to get the flight I want or end up with a crap seat, just to save something like $10 (and I’m just estimating that based on playing with it… if you can save substantially more, this may be much more interesting, but I’m doubtful things will be that rosy). However, if Expedia / Orbitz / Travelocity could, on average, save $10 per ticket, then they’d just clean up. They buy Farecast, and lower their published prices by $5. Let’s say Expedia buys them. Expedia can now undercut Travelocity and Orbitz with a $5 cheaper ticket — and in a world where people shop by price and have no problems going elsewhere to get a better price on the same ticket, this causes Expedia to win. Plus, for each ticket, Expedia is now getting $5 more — as they’re saving $10 from Farecast. How are they doing that? Well, they’re just going by what Farecast says… buy the ticket now, or wait a bit and buy the ticket later. They just have to eat the occasional higher cost when it is higher, but if Farecast works, then statistics will cause things to win overall, and Expedia (or whomever buys Farecast) will win. Simple as that. And I’m just using $10 as a guess here… if it’s more like $20, it’s an even bigger win and no-brainer.
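For what it’s worth, the arithmetic in that old quote can be sketched in a few lines of Python. This is just my back-of-the-envelope version, not anything FareCast or Expedia actually does: the 75% hit rate and the $20 swing are invented numbers, picked only so the average works out to the $10 per ticket guessed at above, and the function names are made up for the example.

```python
def expected_saving(p_correct, saving_when_right, extra_cost_when_wrong):
    """Average saving per ticket when following a buy-now / wait prediction.

    p_correct: fraction of predictions that turn out right (hypothetical).
    saving_when_right: dollars saved on a ticket when the prediction is right.
    extra_cost_when_wrong: dollars lost on a ticket when the prediction is wrong.
    """
    return p_correct * saving_when_right - (1 - p_correct) * extra_cost_when_wrong


def margin_per_ticket(avg_saving, price_cut):
    """What the agency keeps per ticket after passing part of the saving on."""
    return avg_saving - price_cut


# Invented inputs: right 75% of the time, $20 swing either way.
avg = expected_saving(0.75, 20, 20)
print("average saving per ticket:", avg)

# The scenario from the quote: undercut competitors by $5, pocket the rest.
print("kept per ticket after a $5 price cut:", margin_per_ticket(avg, 5))
```

The point of the quote survives the sketch: as long as the average saving stays above the price cut, the occasional wrong call gets absorbed by the statistics.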

    So, my prediction of less than a year (meaning a purchase by July 2007) was off by a year. But you should know by now how accurate my predictions are. ;) The more interesting question here is: Why Microsoft? Not clear (and of course the parties aren’t going to comment much until everything is settled). It appears that it’s mostly for the MSN Travel side of the house… which to me seems somewhat suspect given I don’t see why MSN Travel would push for $115MM unless they’re thinking of doing Expedia II. But perhaps they are! I’m also not sure if this is a search play… certainly searching for tickets is a great and interesting concept on a search engine, but knowing if a price will go up / down seems like a very minor feature on a search engine compared to how it could be used in the actual purchase.

    But, let’s get to the important business, which is congratulations for Oren and Jeff among other people there. Great work all around, and it’s fantastic to see another successful startup!

    4/06/08
    11:35 pm
    The #2 Strategy

    I haven’t posted lately on the Microsoft / Yahoo bid. It’s been interesting seeing things unfold… I thought Yahoo would have been much more receptive to the offer. Or, more to the point, I thought Microsoft and Yahoo had already come to terms and this was more the public drama. But apparently not.

    At the Microsoft company meeting last year, Steve Ballmer, in his big rah-rah speech at the end, mentioned that the first thing needed for the Search team was a plan to be #2. #2? Yeah… if you’re #3 (or #5, behind Baidu and Naver… sucks when your global service is smaller world-wide than a dominant local service in China and South Korea!). But let’s focus on the US market, so the goal is to move beyond Yahoo into the #2 position.

    How, exactly, will that be done?

    Seriously… there are millions of people who have Yahoo as their home page and use their search engine. The #1 query on Google is “Yahoo.” Yahoo, while declining in share a bit, is still huge and will hold on for years and years. They’ve held most of their customers, and their customers aren’t going anywhere. Not like Google hasn’t been around for a few years now.

    So, let’s say Microsoft makes a search engine better than Yahoo. OK… will that get Yahoo customers? Doubtful. Why? There’s already a better engine: Google. Hasn’t been a flood of people moving over.

    OK, so it’s a better engine… and better mail, messenger, portal?

    I don’t buy that.

    OK… so I’m not seeing a clear answer to get Yahoo customers to go to some newer, better thing quickly. So how about attacking Yahoo’s financials? Kill the ad network… compete on price, offer advertisers more for less. Lose tons, but it kills them.

    OK, let’s say that works. You kill Yahoo financially, destroy the asset. Some advertisers have gone to Microsoft, others to Google. Hopefully the market hasn’t come down, even though in a recession it will and it looks like we’re in one. But the customers really aren’t moving… so Microsoft still has to buy Yahoo in the end.

    So ultimately, it’s a question of buying a mostly healthy asset now at a premium, or dumping billions into weakening them for a later purchase… and hope that Google hasn’t just run away with things.

    Thus, ultimately, I think Microsoft has to purchase Yahoo, and came to that conclusion earlier this year. Will there be tons of conflict? Yup. Problems integrating? Absolutely. A huge exodus of smart talent up 101 to Google? Damn straight.

    But more to the point… how else can Microsoft get to #2? I don’t see it. So, they bet the company and try to buy Yahoo… cry havoc!

    4/05/08
    5:29 pm
    Upgraded to 2.5…

    been having random DB problems, so let’s see if 2.5 actually solves things.

    2/12/08
    9:55 am
    My advisor's WSDM

    My advisor, Oren Etzioni, gave the second keynote at WSDM today. His main focus is on his latest work which is a new paradigm in search using open extraction.
    His hypothesis: Use Machine Reading = Information Extraction + tractable inference. For example, who did what? E.g.
    IE(sentence) = who did what? - speaker(Alon Halevy, UW)
    Inference = uncover implicit information - Will Alon visit Seattle? (maybe… is UW University of Washington or University of Waterloo?)
    His argument, in part based on systems like Opine (done by Ana-Maria Popescu, another of Oren’s students and now at Yahoo Research), is information extraction enables very rich applications to be built.
    How do you make it work on the Web? Open IE - Self-supervised Information Extraction (Banko, Cafarella, Soderland, et al, IJCAI ‘07). This leads him to find triples of (Noun-Phrase, Verb, Noun-Phrase).
    More later (possibly even references… y’know, auto-tagging on blogs would actually be really useful…)
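To make the triple idea a bit more concrete, here is a toy Python sketch. It is emphatically not Open IE or anything from Oren’s systems: it just stores one hand-written (Noun-Phrase, Verb, Noun-Phrase) triple and shows where the “which UW?” ambiguity bites. The verb “speaks at”, the `ALIASES` table, and the `expand` helper are all assumptions made up for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One extracted fact in (Noun-Phrase, Verb, Noun-Phrase) form."""
    arg1: str
    relation: str
    arg2: str


# Hypothetical alias table; a real system would have to learn or look this up.
ALIASES = {
    "UW": ["University of Washington", "University of Waterloo"],
}


def expand(triple):
    """Return one triple per possible reading of an ambiguous second argument."""
    candidates = ALIASES.get(triple.arg2, [triple.arg2])
    return [triple._replace(arg2=candidate) for candidate in candidates]


# Hand-written stand-in for what an extractor might produce from a sentence.
extracted = Triple("Alon Halevy", "speaks at", "UW")
readings = expand(extracted)

for reading in readings:
    print(reading)
```

With two readings on the table, “Will Alon visit Seattle?” stays a maybe until something else disambiguates UW, which is where the inference half comes in.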



    2/11/08
    4:25 pm
    GSSP at WSDM

    Right now, there’s about 40 +/- 10 people from Microsoft, Yahoo, and Google in close proximity, along with other associated industry folks. So, as you can imagine, there’s a lot of scuttlebutt about the Microsoft / Yahoo acquisition. Here’s some of it (names withheld to protect the guilty… and hey, this is GOSSIP! Unsubstantiated rumor! Yellow journalism! Cite it at your own risk!)

    • A bunch of Microsofties are very wary of the deal, for reasons largely along the lines I mentioned previously.
    • A bunch of Yaholligans (Yahoochers?) are “just not thinking about it.”

    Neither side is saying they saw this coming (as opposed to search pundits everywhere). And, as near as I can see, everyone is just going about as normal. The Yahoo recruiting booth is sandwiched right between the Google and Microsoft booths at WSDM (eBay and Ask are off to the side… ah, the exhibit hall is rich with metaphor!).

    So, my worthless stock advice du jour: pay close attention to what Microsoft does in the near future. We’ll probably know how serious they are very quickly. Personally, if they go down the hostile acquisition route, that feels like a “bet the company” maneuver, and it would surprise me greatly to see Microsoft do that.

    2/11/08
    11:00 am
    Hector’s Keynote

    I’m attending WSDM 2008 down in Stanford, CA. Lots of people from the big three (at least for now ;) ), and other usual suspects. Hector Garcia-Molina is giving the initial keynote, and has a great slide going over a number of “Holy Cow” moments. In order:

    • WWW (1993)
    • Link Search (using links to rank popularity) (1994)
    • A URL on a Billboard (1998)
    • Napster (1999)
    • “To Google” on a sitcom (2003)
    • WiFi on buses - access everywhere (2007)
    • FaceBook (2008)

    Then he had some challenges, and it’s interesting to think of how many are still viable:

    • Preservation (1993). Turns out, opening old formats (say ~5 years old) is often painful… and even if the format opens, formatting is often horribly broken. Consider old Word docs, or even WordPerfect or such. I can’t imagine what I’d do about ancient docs I may have when I wrote things on a Mac using WriteNow…
    • Digital Deterioration (1998). Sometimes, documents just get lost… or URLs go away, and so on.

    Current challenge problems (2008). Hector mentioned that this is the WSDM program, so not necessarily his list.

    • Beyond Search
      • Identifying user task / intention
      • Document/Word Semantics
    • Information Integration
      • Extraction, entity resolution
      • Combining Results
    • Monetizing
      • Ads, bids, …
      • Spam, Click Fraud, etc.
    • Social Networks
      • modeling
      • wisdom of the crowds
    • Data Mining
      • Media Mining
      • Mining Graphs
    • Privacy
      • Safe data mining
      • Protecting identity
    • Coping with Scale
      • Power Minimization
      • Revisiting Distributed Databases
    • Personalization
      • Access to personal data
      • Tailoring services to me
    • Mobile Access
      • Small devices
      • Peer-to-peer libraries

    Hector’s priorities:

    1. Beyond Search
    2. Information Integration
    3. Monetizing
    4. Social Networking
    5. Coping with Scale

    Lower Priorities:

    • Data Mining
    • Privacy
    • Personalization
    • Mobile Access

    He didn’t care so much about Privacy, as he has nothing to hide. He also doesn’t like Personalization, as he doesn’t like things that change. He opened the floor for dissenters, and some people took him up on it.

    As far as hardness goes:

    1. Information Integration
    2. Beyond Search
    3. Monetizing
    4. Social Networks
    5. Privacy

    and the rest, as “easy:”

    • Data Mining
    • Coping with Scale
    • Personalization
    • Mobile Access

    I’ll see if I can’t find some time to comment on this later on tonight.

    2/03/08
    10:25 pm
    Stupid is as stupid does

    Google whines about Microsoft buying Yahoo! Emphasis mine:

    The openness of the Internet is what made Google — and Yahoo! — possible. A good idea that users find useful spreads quickly. Businesses can be created around the idea. Users benefit from constant innovation. It’s what makes the Internet such an exciting place. So Microsoft’s hostile bid for Yahoo! raises troubling questions. This is about more than simply a financial transaction, one company taking over another. It’s about preserving the underlying principles of the Internet: openness and innovation.

    Could Microsoft now attempt to exert the same sort of inappropriate and illegal influence over the Internet that it did with the PC? While the Internet rewards competitive innovation, Microsoft has frequently sought to establish proprietary monopolies — and then leverage its dominance into new, adjacent markets.

    Could the acquisition of Yahoo! allow Microsoft — despite its legacy of serious legal and regulatory offenses — to extend unfair practices from browsers and operating systems to the Internet? In addition, Microsoft plus Yahoo! equals an overwhelming share of instant messaging and web email accounts. And between them, the two companies operate the two most heavily trafficked portals on the Internet. Could a combination of the two take advantage of a PC software monopoly to unfairly limit the ability of consumers to freely access competitors’ email, IM, and web-based services? Policymakers around the world need to ask these questions — and consumers deserve satisfying answers.

    This hostile bid was announced on Friday, so there is plenty of time for these questions to be thoroughly addressed. We take Internet openness, choice and innovation seriously. They are the core of our culture. We believe that the interests of Internet users come first — and should come first — as the merits of this proposed acquisition are examined and alternatives explored.

    Kind of inflammatory, isn’t it? Well, yeah. But why do you think Microsoft whined to the FTC about Google buying DoubleClick?

    Microsoft responds:

    REDMOND, Wash., Feb. 3, 2008 – The combination of Microsoft and Yahoo! will create a more competitive marketplace by establishing a compelling number two competitor for Internet search and online advertising. The alternative scenarios only lead to less competition on the Internet.

    Today, Google is the dominant search engine and advertising company on the Web. Google has amassed about 75 percent of paid search revenues worldwide and its share continues to grow. According to published reports, Google currently has more than 65 percent search query share in the U.S. and more than 85 percent in Europe. Microsoft and Yahoo! on the other hand have roughly 30 percent combined in the U.S. and approximately 10 percent combined in Europe.

    Microsoft is committed to openness, innovation, and the protection of privacy on the Internet. We believe that the combination of Microsoft and Yahoo! will advance these goals.

    Right…. so in summary, Google doesn’t want Microsoft to buy into two Internet monopolies - instant messaging and e-mail, and use that as leverage to break into its emerging ad monopoly. And Microsoft wants to do just that.

    Get your popcorn.