selberg.org

Archive for April, 2006
4/27/06
12:30 am
Cartography made easy

Perhaps the hottest area out there in search land is maps… or local, depending on what you want to call it. You’ve got Windows Live Local, Google Local Maps, and Yahoo! Local Maps… gotta love it. And all three teams are innovating at a tremendous pace… for example, the aerial view on Windows Live Local is amazing… even though it apparently doesn’t go into Canada. But the border crossing looks nice.

Something I’ve tried using maps for is plotting cycling routes. It turns out most of the mapping packages don’t work very well for this. The customer scenario is simple… given a map, put down a bunch of push-pins (or whatever you want to call them), and calculate a route from one pin to the next. Google Maps doesn’t do this, although there are some sites like WalkJogRun.net and GMaps Pedometer that do an OK job. However, the routes they generate run in a straight line from Point A to Point B rather than along the curve of the road. Lame. Windows Live doesn’t do road overlays yet (come on guys, we know you’re working on it… ship it already!) nor does it do multi-point routes. And that leaves Yahoo! Local.
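To put a number on the straight-line problem: a pin-to-pin tool is presumably just adding up the point-to-point distances between consecutive pins. Here is a minimal Python sketch of that idea using the haversine (great-circle) formula. The function names and the pin coordinates are made up for illustration and are not taken from any of the sites above, and the sketch assumes a spherical Earth with a mean radius of 6371 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius, spherical model


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pins given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def straight_line_route_km(pins):
    """Sum the straight-line legs between consecutive pins."""
    return sum(haversine_km(p, q) for p, q in zip(pins, pins[1:]))


if __name__ == "__main__":
    # Hypothetical push-pins (lat, lon); not the actual Daffodil Classic route.
    pins = [(47.19, -122.29), (47.10, -122.20), (47.05, -122.30)]
    print(f"{straight_line_route_km(pins):.1f} km as the crow flies")
```

Because every leg is the shortest possible path between its two pins, the total can only undershoot the distance along the road, and the gap grows the curvier the road is and the fewer pins you drop.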

Here are two maps from the ride I did on Sunday, the Daffodil Classic: the Planned Route vs. the Actual Route. Mark, who was riding with me, and I missed a turn and thought following some guys, one of whom was wearing an Arrogant Bastard Ale jersey, was a good idea. The maps were actually a pain to make, as I had to put in various intersections instead of just double-clicking to drop a point. I also used Zillow a few times to find a house address to use as a waypoint. On top of that, my original version used more than 25 points, and 25 is the max for Yahoo for some reason (Live only lets you have 10 on the scratch pad… also lame).

So, here’s a shout out to everyone at Google, Microsoft, and Yahoo working on maps: make it easy to create multi-point routes! Directions to / from one place are fine, but multiple points would make these amazingly useful.

And if you could put the bike trails on the maps as well, that’d be nice too.

4/20/06
9:55 pm
AIRWeb 2006

Forgot to mention, I’m on the program committee for AIRWeb — Adversarial Information Retrieval on the Web. Should be a great workshop at SIGIR here in my home of Seattle!


Second Call for Papers (with revised deadlines)

AIRWeb 2006

Second International Workshop on
Adversarial Information Retrieval on the Web

Part of the 29th Annual International ACM SIGIR Conference on Research and
Development on Information Retrieval
10 August 2006 - Seattle, WA

http://airweb.cse.lehigh.edu/

OVERVIEW

The attraction of hundreds of millions of web searches per day provides significant incentive for many content providers to do whatever is necessary to rank highly in search engine results, while search engine providers want to provide the most accurate results.  The conflicting goals of search and content providers are adversarial, and the use of techniques that push rankings higher than they belong is often called search engine spam.  Such methods typically include textual as well as link-based techniques, or their combination.

This, the second AIRWeb workshop, builds on last year’s successful meeting in Chiba, Japan as part of WWW2005.  This year we solicit submissions on any aspect of adversarial information retrieval on the Web.  Particular areas of interest include, but are not limited to:

    - search engine spam and optimization,
    - crawling the web without detection,
    - link-bombing (a.k.a. Google-bombing),
    - comment spam, referrer spam,
    - blog spam (splogs),
    - malicious tagging,
    - reverse engineering of ranking algorithms,
    - advertisement blocking, and
    - web content filtering.

Papers addressing higher-level concerns (e.g., whether ‘open’ algorithms can succeed in an adversarial environment, whether permanent solutions are possible, etc.) are also welcome.

Full papers are limited to 8 pages in SIGIR format; works-in-progress will be permitted 4.  At least three anonymous reviews will be provided per paper, judged on the usual basis of relevance, originality, quality, and presentation.  Proceedings of the workshop will be placed online, and distributed at the workshop.  A selection of best papers will be invited to submit expanded versions to an appropriate journal.

IMPORTANT DATES (revised!)

    5 May 2006       E-mail intention to submit (optional, but helpful)
   12 May 2006       Deadline for submissions
   12 June 2006      Notification of acceptance
   30 June 2006      Camera-ready copy due
   10 August 2006    Date of workshop

ORGANIZING COMMITTEE

   Tim Converse, Yahoo! Search
   Brian D. Davison, Lehigh University
   Marc Najork, Microsoft Research

2006 PROGRAM COMMITTEE

   Sibel Adali, Rensselaer Polytechnic Institute, USA
   Lada Adamic, University of Michigan, USA
   Einat Amitay, IBM Research Haifa, Israel
   Andrei Broder, Yahoo! Research, USA
   Carlos Castillo, Universita di Roma “La Sapienza”, Italy
   Abdur Chowdhury, AOL Search, USA
   Nick Craswell, Microsoft Research Cambridge, UK
   Matt Cutts, Google, USA
   Dennis Fetterly, Microsoft Research, USA
   Zoltan Gyongyi, Stanford University, USA
   Matthew Hurst, BuzzMetrics, USA
   Mark Manasse, Microsoft Research, USA
   Jan Pedersen, Yahoo!, USA
   Bernhard Seefeld, Switzerland
   Erik Selberg, Microsoft Search, USA
   Andrew Tomkins, Yahoo! Research, USA
   Tao Yang, Ask Jeeves/Univ. of California-Santa Barbara, USA

CONTACT ADDRESS: airweb(at)cse.lehigh.edu

4/20/06
2:30 am
Why I run Linux servers…

Hey Windows guys — this here is a huge rant. Just warning you now.

Today, I discovered my new Windows XP box had rebooted itself in the middle of the night, for some auto-update thing. Grr… I gotta turn that off. I then decided to install the rest of the updates it was bugging me about.

As always, it required a reboot. So I rebooted…

And now I can’t log in to the box.

I’ve been up way past my bedtime to fix this problem, and have been on the phone with Microsoft Helpdesk to solve it. So far, we’re busy running chkdsk.

I’ve rebooted zillions of times, launched into the recovery console, and used the Ultimate Boot Disk. I’m not sure what caused the problem. At this point, I don’t care. Linux doesn’t do this to me. Even apt-getting the entire shebang on unstable doesn’t do this to me.

I want my computer to just work. I don’t want to have it change and die… it’s just a huge waste of my time.

My Linux servers work… they’re reliable and don’t change, which is all I want from servers.

Windows guys — this is the value prop you need to hit. A stable server that doesn’t require reboots and won’t randomly nuke itself. Yeah, I know that’s what you’re striving to do, but the amount of interconnected glarp… for example, if the registry is corrupted / not there, everything grinds to a halt. With Linux, if a config isn’t there, one service won’t start. But unless you kill the right config, you can always boot.

Anyway, I’m tired and grumpy… let’s see if recovery won’t fix this problem.

Bah.

4/11/06
12:05 pm
A tale of two heartrates…

As I’ve mentioned, I’m busy doing the 20/20 Lifestyles program at the ProClub… it’s an intense diet and exercise program. I’m down to 241, from about 265 when I started. However, what’s really interesting so far is measuring some change. For example, here are two cycling rides… one was about a year ago (June) and the other was Sunday.

My 80% - 90% heart rate is 151 - 170 BPM. That’s high effort / aerobic. Past 170 is anaerobic… meaning it’s not the greatest as far as improving my cardiovascular system. Heavy exertion.
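For the curious, those numbers can be reproduced with a little arithmetic. Here is a minimal Python sketch, assuming a max heart rate of 189 BPM. That figure is not stated anywhere in the post; it is simply the value for which 80% rounds to 151 and 90% rounds to 170. The function names and the "below 80%" label are made up for illustration.

```python
MAX_HR = 189  # assumption: inferred from 151 BPM = 80% and 170 BPM = 90%


def zone_bounds(max_hr, low_pct, high_pct):
    """Return the (low, high) BPM bounds of a zone given as fractions of max HR."""
    return round(max_hr * low_pct), round(max_hr * high_pct)


def classify(bpm, max_hr=MAX_HR):
    """Label a heart-rate reading relative to the 80% - 90% zone."""
    low, high = zone_bounds(max_hr, 0.80, 0.90)
    if bpm < low:
        return "below 80%"
    if bpm <= high:
        return "high effort / aerobic"
    return "anaerobic"


if __name__ == "__main__":
    print(zone_bounds(MAX_HR, 0.80, 0.90))
    for bpm in (140, 160, 180):  # readings mentioned in the post
        print(bpm, classify(bpm))
```

Under that assumption, a spike to 180 lands in the anaerobic range, 160 sits inside the 80% - 90% zone, and 140 is comfortably below it.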

Notice the spiky red lines above 160… those aren’t so good. Notice they get up to 180… that’s really not so good. Compare with Sunday, where there are some spikes to 160, but nothing too bad, and generally below 140. That’s good.

Amazing how well this is working… now to keep things up!

moz-screenshot-2.jpg

4/07/06
12:50 pm
Crashing the Gate at Microsoft

Random book blogging — I’m currently attending a book signing / talk by Jerome Armstrong and Markos Moulitsas Zuniga (of DailyKos fame), who wrote the great book Crashing the Gate. I’ll update as we go along, but here’s a pic from my cell phone I just snapped… and yeah, I gotta zoom.

Scoble is here, and he should have some better pics.

IMAGE_192.jpg

Update - the talk is more interesting than I thought… it’s much more about advocating technology for change in the Democratic party than about problems with the party establishment (i.e. losing elections). The key point is using technology to improve communication.

Update - on to questions… unsurprisingly, very partisan. But hey, we’re in Seattle… or at least Redmond. :)

Update - a lot of issues are about getting early money to long-shot races.

4/06/06
9:45 pm
Shopping search… so crappy, and yet so useless…

As part of my eat less, move more program, I’ve gotten back into cycling. In fact, a number of the 1995 Pastry Powered T(o)uring Machines are back in action, once again doing STP - the Seattle to Portland Bicycle Ride. It’s a 200 mile 1- or 2-day trek from Seattle to, um, Portland. Kinda says it all in the title, really.

My cycling gear is about a decade or more old. My road bike is a blue 1996 Bianchi Eros, so 10 years old now, and in need of a bit of love. My good pair of riding shorts are an ancient (10+ years) pair of Bellwether shorts (held up well though) and I’m still on the last lens (amber) from my 3-lens Smith riding glasses. My helmet is over a decade old. My old Nike Poobahs (MTB shoes) are pretty destroyed and uncomfortable, and after a couple rides, even though I’m getting closer to the weight I was, I gotta say my butt was really sore in the saddle. Oh, and I have a daughter now who wants to come with me. So, time to gear up!

In Seattle, we have a wealth of quality cycling shops… such as Gregg’s Cycles (although I’ll still call ‘em Gregg’s Greenlake Cycles, which was their name before they expanded). However, if you can’t make it there because you’re working in Redmond and traffic is a nightmare, you might want to shop online.

I first started browsing for a trailer. I could always pick a Burley, which is the one everyone around here has. But on the advice of a friend (who loaned me hers to make sure Laura liked riding in the trailer), I decided to check out the competition. Eventually, I found a few good sites that told me what I wanted, but it was just so… frickin… painful. Ultimately, there aren’t a lot of reviews out there for bike trailers, so finding the 3 useful pages is in fact a bit of a chore. Luckily, it was late at night and I didn’t have much else to do, but for rare purchases things should be better. Ended up getting a Chariot Cougar I, by the way.

I then started browsing for a new saddle. Didn’t know what the brands were, what the features were, how well they’d fit my butt, and so forth. Searched and searched… and searched and searched and searched and searched…. blah. Eventually pieced together enough to have a reasonable opinion, I think, and even ordered a saddle. But not sure if it’s going to work… but hey, 30-day money-back guarantee is always nice. Did the same thing with helmets… blarg. After a while doing that, I punted and just went to Gregg’s.

The moral? Shopping search when you’re looking for a commodity, or looking for reviews on a known item (such as a movie or camera lens or whatever), isn’t a tough problem to solve. Sure, all of the engines out there favor the sales sites over the review sites, mostly because retail shops are all spamming… er, optimizing themselves. However, when you don’t know much about what you want to buy, the amount of work you have to go through is enormous, to the point where just going down to a retailer is by far the fastest way to get what you want, even with 520 traffic.

There’s gotta be something better… and this is something I’m going to think on. This is a common enough user scenario, and I think with a little bit of thought we can hopefully come up with something that solves the entire experience… from researching, to comparing items, to finally purchasing one.

But for the time being… well, I’m thankful for Gregg’s and REI.