selberg.org Home Home

Hector’s Keynote

I’m attending WSDM 2008 down in Stanford, CA. Lots of people from the big three (at least for now ;) ), and other usual suspects. Hector Garcia-Molina is giving the initial keynote, and has a great slide going over a number of “Holy Cow” moments. In order:

  • WWW (1993)
  • Link Search (using links to rank popularity) (1994)
  • A URL on a Billboard (1998)
  • Napster (1999)
  • “To Google” on a sitcom (2003)
  • WiFi on busses - access everywhere (2007)
  • FaceBook (2008)

Then he had some challenges, and it’s interesting to think of how many are still viable:

  • Preservation (1993). Turns out, opening old formats (say ~5 years old) is often painful… and even if the format opens, formatting is often horribly broken. Consider old Word docs, or even WordPerfect or such. I can’t imagine what I’d do about ancient docs I may have when I wrote things on a Mac using WriteNow…
  • Digital Deterioration (1998). Sometimes, documents just get lost… or URLs go away, and so on.

Current challenge problems (2008). Hector mentioned that this is the WSDM program, so not necessarily his list.

  • Beyond Search
  • Identifying user task / intention
  • Document/Word Semantics
  • Information Integration
    • Extraction, entity resolution
    • Combinging Results
  • Monetizing
    • Ads, bids, …
    • Spam, Click Fraud, etc.
  • Social Networks
    • modeling
    • wisdom of the crowds
  • Data Mining
    • Media Mining
    • Mining Graphs
  • Privacy
    • Safe data mining
    • Protecting identity
  • Coping with Scale
    • Power Minimization
    • Revisiting Distributed Databases
  • Personalization
    • Access to personal data
    • Tailoring services to me
  • Mobile Access
    • Small devices
    • Peer-to-peer libraries

    Hectors priorities:

    1. Beyond Search
    2. Information Integration
    3. Monetizing
    4. Social Networking
    5. Coping with Scale

    Lower Priorities:

    • Data Mining
    • Privacy
    • Personalization
    • Mobile Access

    He didn’t care so much about Privacy, as he has nothing to hide. He also doesn’t like Personalization, as he doesn’t like things that change. He opened the floor for dissenters, some people took him up on it.

    As far as hardness goes:

    1. Information Integration
    2. Beyond Search
    3. Monetizing
    4. Social Networks
    5. Privacy

    and the rest, as “easy:”

    • Data Mining
    • Coping with Scale
    • Personalization
    • Mobile Access

    I’ll see if I can’t find some time to comment on this later on tonight.

    Leave a Reply