selberg.org

Archive for February 11th, 2008
2/11/08
4:25 pm
GSSP at WSDM

Right now, there are about 40 +/- 10 people from Microsoft, Yahoo, and Google in close proximity, along with other associated industry folks. So, as you can imagine, there’s a lot of scuttlebutt about the Microsoft / Yahoo acquisition. Here’s some of it (names withheld to protect the guilty… and hey, this is GOSSIP! Unsubstantiated rumor! Yellow journalism! Cite it at your own risk!)

  • A bunch of Microsofties are very wary of the deal, for reasons largely along the lines I mentioned previously.
  • A bunch of Yaholligans (Yahoochers?) are “just not thinking about it.”

Neither side is saying they saw this coming (as opposed to search pundits everywhere). And, as near as I can see, everyone is just going about as normal. The Yahoo recruiting booth is sandwiched right between the Google and Microsoft booths at WSDM (eBay and Ask are off to the side… ah, the exhibit hall is rich with metaphor!).

So, my worthless stock advice du jour: pay close attention to what Microsoft does in the near future. We’ll probably know how serious they are very quickly. Personally, if they go down the hostile acquisition route, that feels like a “bet the company” maneuver, and it would surprise me greatly if Microsoft did that.

2/11/08
11:00 am
Hector’s Keynote

I’m attending WSDM 2008 down in Stanford, CA. Lots of people from the big three (at least for now ;) ), and other usual suspects. Hector Garcia-Molina is giving the initial keynote, and has a great slide going over a number of “Holy Cow” moments. In order:

  • WWW (1993)
  • Link Search (using links to rank popularity) (1994)
  • A URL on a Billboard (1998)
  • Napster (1999)
  • “To Google” on a sitcom (2003)
  • WiFi on buses - access everywhere (2007)
  • Facebook (2008)

Then he had some challenges, and it’s interesting to think of how many are still viable:

  • Preservation (1993). Turns out, opening old formats (say ~5 years old) is often painful… and even if the format opens, formatting is often horribly broken. Consider old Word docs, or even WordPerfect or such. I can’t imagine what I’d do about ancient docs I may have when I wrote things on a Mac using WriteNow…
  • Digital Deterioration (1998). Sometimes, documents just get lost… or URLs go away, and so on.

Current challenge problems (2008). Hector mentioned that this is the WSDM program, so not necessarily his list.

  • Beyond Search
  • Identifying user task / intention
  • Document/Word Semantics
  • Information Integration
    • Extraction, entity resolution
    • Combining Results
  • Monetizing
    • Ads, bids, …
    • Spam, Click Fraud, etc.
  • Social Networks
    • modeling
    • wisdom of the crowds
  • Data Mining
    • Media Mining
    • Mining Graphs
  • Privacy
    • Safe data mining
    • Protecting identity
  • Coping with Scale
    • Power Minimization
    • Revisiting Distributed Databases
  • Personalization
    • Access to personal data
    • Tailoring services to me
  • Mobile Access
    • Small devices
    • Peer-to-peer libraries

    Hector’s priorities:

    1. Beyond Search
    2. Information Integration
    3. Monetizing
    4. Social Networking
    5. Coping with Scale

    Lower Priorities:

    • Data Mining
    • Privacy
    • Personalization
    • Mobile Access

    He didn’t care so much about Privacy, as he has nothing to hide. He also doesn’t like Personalization, as he doesn’t like things that change. He opened the floor for dissenters, and some people took him up on it.

    As far as hardness goes:

    1. Information Integration
    2. Beyond Search
    3. Monetizing
    4. Social Networks
    5. Privacy

    and the rest, as “easy”:

    • Data Mining
    • Coping with Scale
    • Personalization
    • Mobile Access

    I’ll see if I can’t find some time to comment on this later on tonight.