reDesign

April 15, 2008

Testing Calais

Filed under: search, seo, web 3.0 — Rocky Agrawal @ 11:44 pm

This post is to test a new semantic tagging tool from Reuters. The information in the post may or may not be true; I’ll follow up soon with the results.

I love the Eagles’ Hotel California.

I flew United from SeaTac today after a full day meeting in Kirkland.

Lipitor has been recalled by the FDA.

California will be hit by a giant earthquake according to USGS.

The Princeville is my favorite hotel in Kauai.

The Eagles played the Jets last night.

DoubleClick is one of the largest display ad networks.

April 15 is tax day, make sure you get to the post office.

CNN, AOL, Mapquest and Entertainment are all part of the same company.

Jimbo Wales is a co-founder of Wikipedia.

December 15, 2007

Weekly reader - December 15, 2007

Filed under: flickr, intellectual property, privacy, search, weekly reader — Rocky Agrawal @ 11:49 am

Interesting reads from this week:

  • Why Lane Hartwell Popped the ‘Bubble’ Video (WIRED) - The hilarious video by the Richter Scales parodying the Web 2.0 bubble to the tune of “We Didn’t Start the Fire” disappeared from the Web after photographer Lane Hartwell filed a takedown request with YouTube. A picture she took of Valleywag’s Owen Thomas was used for a fraction of a second in the parody video. Although she’s gotten a lot of flak for it, it’s hard to fault her for protecting her rights as a photographer.
    The issue also brings up the challenge that the Web and amateurs pose for people like Hartwell. It’s easy to steal commercial content and it’s easy to find freely usable “good enough” amateur content. When flickr makes it so easy to find pictures that you are free to use, why go through the hassle of improperly using a commercial image? (The picture in this post is a Creative Commons image from Telstar Logistics.)
  • Amazon Ordered to End Free Delivery on Books in France (IDG News Service) - The French high court rules that Amazon is selling books too cheaply because free shipping constitutes an illegal discount. Under French law, booksellers can’t discount a book by more than 5% off the list price. (Discounts of 50% on bestsellers aren’t uncommon in the United States.) via Dave Smith
  • Search: 2010 - A Review (WebProNews) - A look at the future of search with Marissa Mayer from Google, Larry Cornett from Yahoo!, Justin Osmer from Microsoft and Daniel Read from Ask. More of the usual stuff. Usability consultant Jakob Nielsen speaks of moving from “relevance” to “usefulness” to evaluate search. Good luck measuring that. via Jim Simmons
  • Dodd Challenges Google to Provide Leadership in the Digital World (WIRED) - Presidential Noshot Chris Dodd speaks at the Google campus about providing leadership in the new information driven economy. He chastens Google for their approach to China and encourages them to stand up to governments (including our own) when they seek to trample the rights of their citizens.

December 13, 2007

Yahoo! Local gets Yelpy

Filed under: advertising, city guides, local search, search, web 2, web 2.0, yahoo, yelp — Rocky Agrawal @ 2:09 pm

Yahoo! Local has rolled out some new features to increase the Web 2.0-ness of its local search product:

  • RSS feeds. You can subscribe to feeds of all reviews near you. If you find a reviewer you like, you can stay up-to-date on his or her reviews.
  • A “first reviewed by” designation to highlight contributors who are the first to review a place.
  • Attribute drill down. You can narrow your search using filters such as “family friendly,” “casual” or “elegant.”

It’s been a few months since I last checked in on Yahoo! Local. Overall, it’s a huge improvement, but it has a long way to go before catching category leader Yelp. (The metric being my subjective opinion of product quality.)

Yelp has had the first two features for at least a year.

Among the local players, Yelp has had the best incentive system for contributors. Its “First to Review” designation is one of many things that Yelp does to encourage frequent participation. An “Elite” system rewards frequent contributors with a badge on their profile and invitations to parties. The front page of the site highlights a review of the day. Featured Yelpers also appear on the home page.

It may sound corny, but such incentives are important to keeping people engaged. Most social systems have some sort of perk system, including ODP’s edit-alls and metas and the Wikipedia cabal.

Although Yahoo’s design is more visually appealing than it used to be, it’s still cluttered.

Unlike Yelp, the map scrolls off the search results page, making it hard to see where results 3-10 are located unless you have a very large screen.

Getting reviews is more work than it should be. Yahoo! breaks its 69 reviews for The Italian Store across 29 pages, 3 at a time. Yelp shows all 42 of its reviews on one page, making it very easy to scan.

Then there are the ads. I’m all for ads — I work in the Web space and like to get paid — when they’re relevant. The ads on Yahoo! Local are anything but. Here is an example of the ads that appeared above the listings for restaurants:

Irrelevant ads on Yahoo! Local

The top two ads are for services that compete with Yahoo! Local. Ads on the side (not shown) pitched “Watch mouth-watering videos of Oklahoma’s best restaurants” and one from Target offered “Find restaurant online. Shop & Save at Target.com Today.” (I’ll admit to clicking through on the Oklahoma ad just to see what would constitute a mouth-watering video of Oklahoma restaurants. Unfortunately, they linked it to a video of a bad rendition of Rudolph the Red-Nosed Reindeer.)

I understand that local advertisers are scarce, especially outside the Bay Area. But Yelp takes the right approach. If you don’t have something interesting to say, keep your mouth shut.

More on: local search, yahoo, yelp

Disclosure: I used to work on local products for AOL.

December 11, 2007

Searching for a search engine that understands deep dish pizza

Filed under: local search, mobile, mobile search, search, wireless — Rocky Agrawal @ 8:23 pm

Update: If you’re looking for deep dish pizza near O’Hare, see my step-by-step guide to Gino’s East on Higgins.

Having gone to school in Chicago, I love deep dish pizza. Unfortunately, there’s no Carmen’s or Giordano’s in the D.C. area. The last time I had good Chicago-style pizza was when my friend Jason flew in a few Giordano’s pies for his Super Bowl party. (The Colts were also represented with tenderloins.)

Jason with Giordano’s pizza

I was connecting through O’Hare today and wanted to get some deep dish at the airport. I asked Google for “deep dish pizza at o’hare”. No luck.

This is a really difficult query for search engines. It seems simple, but it has a lot of components that make it tricky. But it’s exactly the kind of query that search engines should be able to handle.

Breaking apart the components of the query, we have:

“deep dish pizza” is a distinct concept. It’s different from “New York pizza,” “Sicilian pizza,” and “Indiana pizza”. (I don’t know what that is, but my friend Wanita swears there’s such a thing.) I could restrict my query using quotation marks around the phrase “deep dish pizza” but I shouldn’t have to do that. On the other hand, “deep dish pizza” is close enough to “Chicago-style pizza” that those results should be included.

The second part of my query was “at”. Search engines typically treat words like “at” “and” “near” and “or” either as filler and ignore them, or they use them as Boolean operators. There’s a big difference between the query “deep dish pizza at o’hare” and “deep dish pizza near o’hare”. With 90 minutes between flights, “near” doesn’t work.
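To make the "filler" problem concrete, here is a minimal sketch of what happens when a keyword engine drops stop words. The stop-word list and the function name are made up for illustration; real engines are far more sophisticated.

```python
# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"at", "near", "and", "or", "the", "in"}

def keyword_tokens(query):
    """Lowercase the query, split on whitespace, and drop stop words."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]

at_query = keyword_tokens("deep dish pizza at o'hare")
near_query = keyword_tokens("deep dish pizza near o'hare")

# Once "at" and "near" are thrown away, the two queries are indistinguishable.
print(at_query == near_query)
```

The distinction between "at" and "near" that matters to a traveler with 90 minutes between flights is gone before any matching even starts.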

“O’Hare” is also tricky. It’s a known place with a physical address. But Google and other search engines know it as ORD or 10000 Bessie Coleman Dr, Chicago, IL 60666. Compare the results for “deep dish pizza o’hare” with those for “deep dish pizza ORD”. Frequent travelers might shortcut to ORD, but again, that’s not a burden users should have to bear.

The answer, in theory, lies in natural language search. I’ve written before about how search engines force people to think like computers. Natural language search tries to teach computers to think like people. The most talked about company in the space is Powerset. I saw a controlled demonstration of their technology in August, but the promised fall public beta has yet to materialize.

Keyword-based search engines fake some of this by using tricks like stemming, synonyms and anchor text. With the uptake of sites like Yahoo! Answers and the sheer volume of information on the Web, there’s a decent chance that someone has phrased the question the same way. In the search results page for my original query, one of the results was a Frommer’s Q&A.
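The synonym trick can be sketched in a few lines. This is a toy version under loud assumptions: the synonym table is hand-built from the examples in this post, and `expand_query` is a hypothetical helper, not how any particular engine does it.

```python
# Hypothetical synonym table built from the examples in this post.
SYNONYMS = {
    "deep dish pizza": ["chicago-style pizza"],
    "o'hare": ["ord"],
}

def expand_query(query):
    """Return the query plus variants with known phrases swapped for synonyms."""
    variants = [query.lower()]
    for phrase, alternates in SYNONYMS.items():
        # Iterate over a copy so variants added in this pass are not revisited.
        for existing in list(variants):
            if phrase in existing:
                for alternate in alternates:
                    variants.append(existing.replace(phrase, alternate))
    return variants

for variant in expand_query("deep dish pizza o'hare"):
    print(variant)
```

An engine could then run all the variants and merge the results, so the user never has to know that the airport is also called ORD.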

In addition to the structural challenges of queries like this, there’s also the challenge of how data is gathered. Data providers do a terrible job of gathering information about a place that’s really a collection of places — such as malls and airports. In some cases, information is simply not collected. In others, the information that is collected isn’t sufficiently descriptive. The physical addresses of these businesses aren’t meaningful to users. “Terminal 1, Gate C3” makes sense; 10000 Bessie Coleman Dr, Chicago, IL 60666 does not.

OK, how many geeks are pulling out their laptops and doing searches like this you ask? Not a lot. And in search from the Web, it’s relatively easy to re-do the query and keep tweaking it until you get an answer.

Getting better answers faster becomes increasingly important as search moves to mobile devices and with voice-based search from the likes of Tellme and Google’s GOOG-411. In those environments, the penalty for failure is much higher. Users can’t easily tweak queries. They can’t browse endless Web sites to try to get the answer. They need the algorithms to do the work for them.

I was finally able to find out about pizza options at O’Hare by going to the O’Hare Web site and looking at a PDF map of Terminal 1. There isn’t a deep dish pizza place in Terminal 1, though there are Pizzeria Unos in other terminals.

The psychic search engine would know that Pizzeria Uno is not an answer that works for me.

More on: local search, search, wireless

November 19, 2007

Searching outside the search box

Filed under: facebook, search, social networking, web 2, web 2.0 — Rocky Agrawal @ 11:24 am

A large untapped opportunity in social networks is connecting people with information they’re looking for.

I was flying home this weekend from Lake Tahoe and connected in Salt Lake City. While I was there, I updated my Facebook status to indicate that I was in Utah for the first time.

Later that night I received a message from my friend Dean:

hey Rocky, whatcha doing in the beautiful, bizarre state of UT?

I lived there for a year after AOL. Let me know if you need any tips on where to go while you are there.

Salt Lake City Airport

Without doing a search, I had information coming directly to me from someone I knew. I was just in Utah for 90 minutes, so I didn’t need any tips. But when I go there for real, I now know to begin my search with Dean.

By distributing information needs through our network, social networks allow us to tap into a large base of knowledge from known sources.

Services like Yahoo! Answers allow you to ask questions, but Answers is largely anonymous. Too many of the answers devolve into insults and name calling and it’s hard to tell if people know what they’re talking about. There is also an incentive problem: I don’t participate in Yahoo! Answers because I don’t have enough time to answer questions for random strangers. But I’m happy to answer questions for friends.

LinkedIn’s Answers product usually delivers better results by posing questions just to your network. And because I know these people, I can easily assess the credibility of their answers. LinkedIn’s professional focus is a bit limiting; I wouldn’t pose questions about vacation plans there.

If I were really going to Utah, I suppose I could spam everyone I know with an email asking if anyone knew anything about Utah. The passive approach of updating my Facebook status is more socially acceptable.

For now, this relies on my friends seeing my status message and responding. It was more or less random that Dean saw my status message. As social networks get smarter (and get more data), the request can be routed automatically to the people likely to have a good answer. My status message could be displayed more prominently to friends whose profiles indicated that they’d lived in or visited Utah.
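As a rough sketch of that kind of routing, assume each friend's profile carries a set of places they have lived in or visited. The `Friend` class, the `friends_to_notify` function and the second friend's data are all hypothetical; this is not a real Facebook API.

```python
from dataclasses import dataclass, field

@dataclass
class Friend:
    name: str
    # Places the friend has lived in or visited, per their profile (assumed field).
    places: set = field(default_factory=set)

def friends_to_notify(status_place, friends):
    """Pick the friends whose profiles mention the place in the status update."""
    return [friend.name for friend in friends if status_place in friend.places]

friends = [
    Friend("Dean", {"Utah"}),   # Dean lived in Utah for a year
    Friend("Alex", {"Ohio"}),   # hypothetical friend with no Utah connection
]

print(friends_to_notify("Utah", friends))
```

A real system would need to pull the place out of free-text status messages and weigh how well each friend knows it, but the basic match is this simple.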

Marketers can also be part of the conversation. Facebook allows you to become a “fan” of a company or a product. If I become a “fan” of United Airlines, they could send me information about their Utah service or upcoming sales. I’d love to hear about any great deals to Park City this winter.

More on: Facebook, search

September 2, 2007

Google News starts hosting wire service content

Filed under: google, journalism, media, newspapers, search — Rocky Agrawal @ 9:24 am

In more bad news for the newspaper industry, Google is now starting to host its own versions of content from the Associated Press, Agence France-Presse and UK and Canadian wire services.

Here’s a screen shot of an AFP story:

AFP story hosted by Google News

Yahoo! and AOL have been doing this for years. Here’s the same story on Yahoo!

Google had been linking off to versions of wire service stories published by other media outlets, driving traffic to those sites.

For readers, this is a good thing. In most cases, news outlets have subtracted value from wire stories: making their own edits (sometimes introducing errors), cluttering stories with lots of irrelevant ads, splitting stories across multiple pages. The versions of wire stories on newspaper sites are sometimes shorter because they were cut to fit the space available in the paper. Readers also have had to deal with various UIs depending on what site they get sent to.

Now readers get complete stories in a consistent format. In typical Google form, the layout is simple. You get the story, any related pictures and links to related stories. The entire contents of the article are on one page. Compare the same story about Mexican truck programs on Google News, My San Antonio, ABCNews.com, the Houston Chronicle and the Denver Post. The Google News page is by far the cleanest and loads the fastest.

The move to host wire content is married with better duplicate detection. This dramatically reduces the House of Mirrors effect. Readers won’t see the same wire story 300 times in the results. This makes it easier for readers to find other voices on a topic.
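I don't know how Google detects duplicates. One common textbook approach, sketched here purely as an illustration, is to break each story into overlapping word "shingles" and compare the sets with Jaccard similarity. The sample sentences, the shingle size and the 0.8 threshold are all assumptions.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def is_near_duplicate(text_a, text_b, threshold=0.8):
    """Treat two stories as the same wire copy if their shingles mostly overlap."""
    return jaccard(shingles(text_a), shingles(text_b)) >= threshold

# Made-up sentences standing in for full stories.
wire = "officials said the truck program will begin next month"
edited = "officials said the truck program will begin next month nationwide"
other = "the city council approved a new budget for schools"

print(is_near_duplicate(wire, edited))
print(is_near_duplicate(wire, other))
```

A lightly edited copy of a wire story shares almost all of its shingles with the original, while an unrelated story shares none, which is what lets the 300 copies collapse into one result.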

I haven’t seen advertising on these pages yet, but now that Google seems to be licensing the content, it seems inevitable.

More on: Google, journalism, newspapers

August 29, 2007

Googling all the news that’s fit to correct

Filed under: journalism, media, newspapers, research, search — Rocky Agrawal @ 5:12 pm

The New York Times public editor writes this week about an increasing problem: incorrect information from the Times that lives forever in search engines. The Times has started surfacing its archived content in a way that search engines can crawl. This presents problems for people who were treated unfairly.

The Times tells the story of Allen Kraus, a deputy commissioner for the New York City Human Resources Administration. The Times reported that he resigned under pressure because of an investigation. If you Google “Allen Kraus” today, the second link is for the Times archive with stories titled “6 Held in Welfare Fraud Scheme; Inquiry Uncovered Worker Bribes” and “A Welfare Official Denies He Resigned Because of Inquiry”.

That’s not the best impression to make on a prospective employer or client.

People are coming forward at the rate of roughly one a day to complain that they are being embarrassed, are worried about losing or not getting jobs, or may be losing customers because of the sudden prominence of old news articles that contain errors or were never followed up. … Kraus is hardly alone in claiming real or potential harm. A person arrested years ago on charges of fondling a child said the accusation was false and the charges were dropped. The Times reported the arrest but not the disposition of the case.

The Times says that if they worked to correct errors, that’s all they would be doing:

But what can they do? The choices all seem fraught with pitfalls. You can’t accept someone’s word that an old article was wrong. What if that person who was charged with abusing a child really was guilty? Re-report every story challenged by someone? Impossible, said Jonathan Landman, the deputy managing editor in charge of the newsroom’s online operation: there’d be time for nothing else.

Although Wikipedia is often slammed for having inaccurate information, at least with Wikipedia, you’ve got a more than fair chance of getting an error corrected. Bloggers are generally more than willing to fix genuine errors — and they are much more approachable than the Times to point them out in the first place.

A big part of the problem is that many newspapers consider their archives to be a permanent record of what was in print. Newspapers view themselves as the first draft of history.

Removing anything from the historical record would be, in the words of Craig Whitney, the assistant managing editor in charge of maintaining Times standards, “like airbrushing Trotsky out of the Kremlin picture.”

In newsrooms, the archives are called “the morgue.” Now that the Times is bringing those dead stories to life, a different approach is needed. In the past, finding old newspaper articles required you to search through microfilm, dusty newspapers or expensive commercial databases, which made it clear that you were looking at something old. With one-click Googlability, ancient articles look as fresh as something published minutes ago.

Some news outlets still show the incorrect version of a story with a footnote at the bottom showing the correction. That is absolutely wrong. If you make an error and you know it, fix it where the error was made, not someplace people might not get to. (Especially if they’re just seeing an excerpt on a search results page.) The first thing someone should come across online is the best version the newspaper can offer.

Bloggers do this all the time. When an error is pointed out, they correct the main blog entry and usually indicate that a previous version contained incorrect information. If the error was pointed out in the comments, they usually point readers at the comment.

For newspapers worried about the “permanent historical record”, that can be maintained as a separate link off the current page with a prominent disclosure that shows that the article has been superseded.

The archives of the Times present another problem: you have to pay to see the archived story. The free headline and excerpt might say “John Smith arrested in connection with fondling neighbor’s child”. But someone would have to pay $4.95 to see the update that says “The Times incorrectly reported the name, it was Justin Smith.”

August 20, 2007

comScore redefines search, Google wins bigger

Filed under: aol, facebook, google, metrics, search, statistics, yahoo — Rocky Agrawal @ 3:47 pm

ComScore is changing the methodology for its qSearch market share ratings. Instead of just counting search activity at the major search engines, comScore is expanding the definition of search to include searches at sites such as Wikipedia, eBay, Amazon, MySpace, Mapquest, Craigslist and other vertical players.

Searches across multiple tabs for the same search term will also be counted separately. For example, if you search for “hurricane dean” in Web search and then click the tabs for news and pictures, that will be counted as three searches.

For those who were hoping this might shrink Google’s share of search, think again. Under the new methodology, Google’s share grew 6 points in March compared with the old methodology. The additions to Google (which include YouTube) are greater than all of TimeWarner’s search traffic (which itself benefits greatly from the addition of Mapquest).

Here is a comparison of core search and expanded search metrics based on July 2007 data:

Core search:

  1. Google
  2. Yahoo!
  3. Microsoft
  4. Ask
  5. Time Warner (AOL Search)

Expanded search:

  1. Google (Google, YouTube)
  2. Yahoo!
  3. Microsoft
  4. Time Warner (AOL Search, Mapquest)
  5. Fox Interactive (MySpace)
  6. eBay
  7. Ask
  8. Craigslist
  9. Amazon
  10. Infospace

Using the expanded definition, Ask drops from #4 to #7, being passed by TimeWarner, Fox Interactive Media (MySpace) and eBay. TimeWarner moves up from #5 to #4, based largely on Mapquest traffic.

The numbers don’t seem to include Facebook, which according to its blog does more than 600 million searches a month. If that number were comparable to qSearch data, Facebook would be at #5 in the expanded search.

More on: AOL, Google, Yahoo!, Facebook.

Disclosure: I used to work at AOL Search.

August 9, 2007

Writing news for search engines and blogs

Filed under: blogs, journalism, media, newspapers, search, seo — Rocky Agrawal @ 10:47 am

One of the reasons I love blogging is that it gives me the opportunity to see things at a micro level. I can see patterns and analyze data in a way that I couldn’t in a typical work role.

When I was writing the follow-up post to the 35W bridge collapse the other day, I initially wrote this:

MN-DOT has finally released the video from last week’s bridge collapse.

That’s how I would have written it based on my journalism experience. As I tapped out the period, I realized that the sentence is meaningless to search engines. And thus unfindable by the many users who rely on search engines to find news. I rewrote it as:

MN-DOT has finally released the video from the August 1 collapse of the Interstate 35W bridge over the Mississippi River.

That sounded too stilted to me. Based on having looked at my traffic data, I knew people weren’t searching on “August 1”. They were searching heavily on “Mississippi” and “35W”. The final version I used is this:

MN-DOT has finally released the video from last week’s collapse of the Interstate 35W bridge over the Mississippi River.

Writing headlines for blogs is even trickier. Blog headlines have two audiences: search engines and readers who view blog posts in RSS feeds. The clever headline that might get a reader to click on a link is often lacking in the keywords that search engines need.

This headline from Tuesday is meaningless to search engines: “Mmmm… McCarrots and McMilk.” It seems to be working from an RSS feed clickthrough perspective.

I strive for a mix of people friendly and search engine friendly headlines. When I use headlines like the McCarrots and McMilk one, I do an extra pass to make sure that the body of the post contains the keywords searchers are looking for.
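That extra pass is easy to automate. Here is a minimal sketch, assuming you already have a list of the terms people are searching on; `missing_keywords` is a hypothetical helper, and it uses naive substring matching rather than anything a search engine would do.

```python
def missing_keywords(text, keywords):
    """Return the keywords that never appear in the text (case-insensitive)."""
    lowered = text.lower()
    return [keyword for keyword in keywords if keyword.lower() not in lowered]

# Terms from my traffic data.
keywords = ["Mississippi", "35W"]

first_draft = "MN-DOT has finally released the video from last week's bridge collapse."
final_draft = (
    "MN-DOT has finally released the video from last week's collapse "
    "of the Interstate 35W bridge over the Mississippi River."
)

print(missing_keywords(first_draft, keywords))
print(missing_keywords(final_draft, keywords))
```

The first draft is missing both terms; the final version is missing none. The same check works on the body of a post that sits under a clever but keyword-free headline.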

Speaking of search engines and news, this week drove home a pet peeve: news sites like CNN defaulting their search engines to search the Web. If I wanted to search the Web, I would have gone to Google, used the Google search box in Firefox or used the search box in the Google Toolbar. If I’m searching on your news site, I want your news content.

CNN Search box

I got some traffic to this blog from people searching CNN for “35W bridge traffic camera video”. That search led them to this post on the video being released. Which led them back to the video on CNN’s site.

As much as I welcome the traffic, it’s a terrible user experience. A hybrid model, where CNN content comes before Web results, would be more effective and still serve the revenue goals of offering Web search.
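The hybrid model amounts to a simple merge. This sketch assumes the site search and the Web search each return an ordered list of URLs; the function name and the example.com-style URLs are placeholders, not anything CNN actually runs.

```python
def hybrid_results(site_results, web_results, limit=10):
    """List the site's own results first, then Web results, without repeats."""
    seen = set()
    merged = []
    for url in site_results + web_results:
        if url not in seen:
            seen.add(url)
            merged.append(url)
    return merged[:limit]

# Placeholder URLs for illustration.
site_results = [
    "https://news.example/35w-video",
    "https://news.example/35w-collapse",
]
web_results = [
    "https://blog.example/35w-video-released",
    "https://news.example/35w-video",  # already covered by the site results
]

for url in hybrid_results(site_results, web_results):
    print(url)
```

The searcher sees the news site's own video first and only then the blogs that point back to it.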

See also: Taking newspapers beyond tonight’s fishwrap

More on: newspapers, journalism

July 9, 2007

Nielsen tears up page view metrics

Filed under: YouTube, google, search, statistics, yahoo — Rocky Agrawal @ 3:15 pm

Hallelujah! From the AP story on Nielsen’s move:

A leading online measurement service will scrap rankings based on the longtime industry yardstick of page views and begin tracking how long visitors spend at the sites.

The move by Nielsen/NetRatings, expected to be announced Tuesday, comes as online video and new technologies increasingly make page views less meaningful.

In my post on creating killer products, I mentioned avoiding page view metrics.

In today’s Web world, they’re a terrible measure of user engagement. A user who spends 10 minutes watching a video or 15 minutes engaged in a flash game counts the same as a user who hit your site by accident from a search engine.
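A quick back-of-the-envelope sketch shows how differently the two metrics treat those users. The 10 and 15 minutes come from the example above; the 5 seconds for the accidental visitor and the one-page-view-per-visit figures are assumptions.

```python
# One row per visit; the numbers are illustrative.
visits = [
    {"user": "video_watcher", "page_views": 1, "seconds": 10 * 60},
    {"user": "game_player", "page_views": 1, "seconds": 15 * 60},
    {"user": "accidental_click", "page_views": 1, "seconds": 5},
]

def share_by(visits, metric):
    """Return each user's share of the total for the given metric."""
    total = sum(visit[metric] for visit in visits)
    return {visit["user"]: visit[metric] / total for visit in visits}

page_view_share = share_by(visits, "page_views")
time_share = share_by(visits, "seconds")

for user in page_view_share:
    print(f"{user}: {page_view_share[user]:.1%} of page views, "
          f"{time_share[user]:.1%} of time on site")
```

By page views all three visitors look identical. By time on site, the accidental visitor all but disappears, which is much closer to how engaged each one actually was.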

Even for non-multimedia experiences, chasing page views can create terrible user experiences. Consider some examples:

  • Splitting news stories onto separate pages. Each page of the story counts as a page view. Not only is the paging annoying to the user, it hurts the way your pages are indexed making it harder for people to find your content in search engines. It’s been a while since I worked in the news business, but I’d love to see what the drop offs are at each page.
  • Pointless confirmation pages. Many sites take you to confirmation pages just so they can count the additional page.
  • Popups/popunders/etc. I read a story a while back about a publisher using popup and popunder ads to pump their page view numbers.

Ignoring page view metrics has created some great experiences:

  • Google Maps. If you drag the map around the screen, you count as one page view. But to the user this is much easier and a much better experience than the old model of clicking an arrow on the side of the screen and waiting for the page to refresh.
  • YouTube’s embedded videos. If they’d been chasing page views, they never would have allowed users to embed videos on their own blogs.
  • Yahoo’s streaming quotes. You don’t have to refresh the page to see the latest stock price.
  • Flash-based instant messaging. You can chat with your friends without having to download and install a special client.

Not only are these much better experiences, they can also save the publisher money. It takes less bandwidth and processing power to send down just the updated information than it does to generate and send an entire page.

Nielsen’s time-on-site measurement is an improvement over page views, but as with any single measurement it can be gamed. The best managers will look at a range of metrics specific to their situation.
