I’ve long advocated that news organizations geotag the news. But I’ve been skeptical of automated systems for doing this. Google News recently provided a terrific example of what can go wrong when you use entity extraction for such a task.
In this case, reported by Valleywag, Google is comically wrong. But even when Google is roughly right, the map is often there just for the sake of having a map. The location information is often not very precise or isn’t really relevant.
For example, this story about a Yankees game puts Yankee Stadium somewhere near City Hall. Stories about national issues are often datelined New York or Washington because the reporter happens to be sitting in one of those two cities.
For individual story pages, an inaccurate map isn’t the worst thing in the world. But when you plot many of these stories on a single map, the map becomes worthless. In Google Earth, you can get a layer that provides geotagged news from The New York Times. I’ve seen pointless geotagging there, such as a story titled “U.S. Moves Toward International Accounting Rule” geotagged as being in the “USA” (which Google Earth plots in Oklahoma).
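One way to keep this kind of noise off a combined map is to record how precise each geotag is and plot only the stories that clear some threshold. The sketch below is purely illustrative: the `Precision` levels, the `Story` record, and the `worth_plotting` helper are names I made up, not part of Google Earth or any news feed, and the second story is invented sample data.

```python
from dataclasses import dataclass
from enum import IntEnum


class Precision(IntEnum):
    """Hypothetical scale for how tightly a geotag pins a story down."""
    COUNTRY = 1
    STATE = 2
    CITY = 3
    NEIGHBORHOOD = 4
    ADDRESS = 5


@dataclass
class Story:
    headline: str
    place: str
    precision: Precision


def worth_plotting(stories, minimum=Precision.NEIGHBORHOOD):
    """Keep only stories whose geotag is at least as fine as `minimum`."""
    return [s for s in stories if s.precision >= minimum]


stories = [
    # Headline from the example above; tagged only at the country level.
    Story("U.S. Moves Toward International Accounting Rule", "USA",
          Precision.COUNTRY),
    # Invented sample: a review tied to a specific street address.
    Story("Review: a new bistro downtown", "a street address",
          Precision.ADDRESS),
]

plotted = worth_plotting(stories)
```

With the default threshold, the accounting story is dropped and only the address-level review reaches the map. Where to set the threshold is a judgment call; the point is simply that a pin for “USA” tells the reader nothing.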
There are many cases where geotagging makes sense and provides a real service to users:
- Restaurant reviews
- Crime stories
- Event listings
- Travel stories
In each of these cases, the location is a critical part of the story. The minimal extra effort involved in geotagging these stories would significantly increase their shelf life and usability.