Using GeoNames for Genealogical Research

We’ve talked about the importance of normalizing place names in your family tree. You may find the same state name written “Massachusetts”, “Massachusetts, USA”, “Mass.”, or just “MA”, and that’s just the beginning. Within states there are counties, boroughs, cities, towns, and local geographic references. The same places will frequently be referred to differently in different records, and it is important to know when two records refer to the same geographic location. There are several reasons for this, the most obvious being that you need to know whether the records refer to the same person. As an example, I have a grand aunt who seemed, according to some records, to have died in Florida. I was pretty sure this was not the case, but I thought there was a small possibility that she had moved there late in life without my knowing. It turns out that another woman with the same first name and the same (married) last name, born on the very same day, did die and was buried in Florida. But my grand aunt was buried in Utah, just as I suspected. In this case, the place names were so different that Citrus, Florida stuck out like a sore thumb, and I didn’t miss it. But it could have been different. What if my aunt had been buried in a different part of Florida, possibly somewhere with a name I didn’t know? That’s the kind of discrepancy that is easy to miss, and I could very well have added quite a bit more inaccurate data to my family tree before the error was discovered.
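For readers who keep their tree in a spreadsheet or their own database, here is a minimal sketch of what normalizing a state name can look like in Python. The alias table and the function name `normalize_state` are my own illustration, not part of any genealogy package, and a real table would need far more entries.

```python
from typing import Optional

# Hypothetical alias table: each spelling seen in records maps to one canonical form.
STATE_ALIASES = {
    "massachusetts": "Massachusetts, USA",
    "massachusetts, usa": "Massachusetts, USA",
    "mass.": "Massachusetts, USA",
    "ma": "Massachusetts, USA",
}


def normalize_state(raw: str) -> Optional[str]:
    """Return the canonical form of a state name, or None if it is not recognized."""
    # Lowercase and collapse runs of whitespace so "  MA " and "ma" compare equal.
    key = " ".join(raw.strip().lower().split())
    return STATE_ALIASES.get(key)


print(normalize_state("Mass."))   # a known alias
print(normalize_state("Citrus"))  # not in the table, so None
```

Returning `None` for an unknown spelling, rather than guessing, leaves the decision to the researcher.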

Software can be very helpful in working with place names, but it can also make us vulnerable to other errors. As a simple example, I once encountered references to a colonial ancestor living in Germany. How could that be? In case you haven’t already guessed, she lived in Delaware, and someone recorded it as DE, the standard U.S. state abbreviation for Delaware. But DE is also the international two-letter code for Germany (Deutschland in German). International standards (such as ISO 3166 for country codes) are indispensable in representing place names unambiguously and, if you’re like me, you use state abbreviations without giving them much thought. Unfortunately, computer applications often simply digitize paper forms, and since people have been writing addresses on one or a few lines for ages, computer programs tend to provide so-called free text fields for place names. And that’s where the trouble starts. In order to compare place names, software needs to parse these fields into their constituent components. This is often done heuristically, so “Dover, DE” will be correctly interpreted as Dover, Delaware, and “Hamburg, DE” will be correctly interpreted as Hamburg, Germany. But in this case, something went wrong. Most likely, the software was unable to identify the name of the town or settlement after consulting a geographic database, so it fell back on the interpretation of DE as referring to Germany. The moral of the story is that you should always manually review place names before storing them in your family tree or database.
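To make the heuristic concrete, here is a small Python sketch of how a program might interpret “Town, DE”. It is an assumption-laden illustration, not how any particular genealogy application works: the two-entry `GAZETTEER` stands in for a real geographic database, and `interpret_de` is a name I made up. Unlike the software in my story, it flags an unknown town for manual review instead of silently falling back to Germany.

```python
# Tiny stand-in gazetteer (an assumption for illustration): town -> regions where it is known.
GAZETTEER = {
    "dover": {"Delaware, USA"},
    "hamburg": {"Germany"},
}

# The two readings of the code "DE" discussed above.
DE_CANDIDATES = {"Delaware, USA", "Germany"}


def interpret_de(place: str) -> str:
    """Interpret a 'Town, DE' string; flag it for review when the town is unknown."""
    # Split on the last comma: everything before it is the town, after it the code.
    town, _, code = (part.strip() for part in place.rpartition(","))
    if code.upper() != "DE" or not town:
        raise ValueError(f"expected 'Town, DE', got {place!r}")
    matches = GAZETTEER.get(town.lower(), set()) & DE_CANDIDATES
    if len(matches) == 1:
        return f"{town}, {next(iter(matches))}"
    # Unknown town, or a town known in both places: do not guess.
    return f"{town}, DE (ambiguous: needs manual review)"


for example in ("Dover, DE", "Hamburg, DE", "Lewes, DE"):
    print(interpret_de(example))
```

Lewes is not in the toy gazetteer, so it comes back flagged rather than being placed in Germany.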

Of course, there is a problem: there are a lot of place names, and no matter how extensive our geographic knowledge, we are likely to encounter names we don’t recognize. Worse, we may think a place name is correct or complete when, in fact, it is not. This is where tool support comes in. Popular genealogy applications such as Family Tree Maker include geographic databases and tools that allow you to validate and correct place names. But what if you don’t use one of these tools? GeoNames is an open source database (licensed under the Creative Commons 3.0 Attribution license) that includes over eight million records, and it is freely available on the web. You can use the web-based interface, download the database, or make use of the web service interface. Most of the time, you’ll probably want to use the form on the web site to look up place names using your browser. The other options are primarily of interest to application developers.
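For the application developers, here is a rough Python sketch of calling the web service. It assumes the GeoNames search endpoint at `http://api.geonames.org/searchJSON` with the parameters `q`, `country`, `maxRows` and `username`, and response fields named `name`, `fcodeName`, `lat` and `lng`, which is how I understand the service to work; check the GeoNames web service documentation before relying on any of it. The function names are my own, and you will need to register a GeoNames account to get a user name.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the GeoNames JSON search service.
SEARCH_ENDPOINT = "http://api.geonames.org/searchJSON"


def build_search_url(name, country_code, username, max_rows=10):
    """Build a search URL for a place name restricted to one country."""
    params = {
        "q": name,
        "country": country_code,  # ISO 3166 two-letter code, e.g. "US"
        "maxRows": max_rows,
        "username": username,     # your registered GeoNames account name
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def summarize(payload):
    """Reduce a decoded response to (name, feature, latitude, longitude) tuples."""
    return [
        (g.get("name"), g.get("fcodeName"), float(g["lat"]), float(g["lng"]))
        for g in payload.get("geonames", [])
    ]


def search(name, country_code, username):
    """Query the service over the network and return the summarized matches."""
    url = build_search_url(name, country_code, username)
    with urlopen(url, timeout=10) as response:
        return summarize(json.load(response))
```

Something like `search("Dover", "US", "YOUR_USERNAME")` would then return a list of candidate places to review by hand.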

So, how does it work? Let’s suppose that the place name Adwick Le Street, Yorkshire, England is unfamiliar to us, or we are unsure it is spelled correctly. Head over to GeoNames at http://www.geonames.org, type “Adwick Le Street” in the search box, and select “United Kingdom” from the drop-down box to the right of it. Press Search. You will see something like this:

2 records found for “Adwick Le Street”

1. [P] Adwick le Street (with a link to the Wikipedia article)
   Country: United Kingdom, England (Doncaster > Brodsworth)
   Feature class: populated place
   Latitude: N 53° 34′ 14″
   Longitude: W 1° 11′ 4″

2. [S] Adwick le Street Castle Hills
   Country: United Kingdom, England (Doncaster)
   Feature class: castle
   Latitude: N 53° 33′ 14″
   Longitude: W 1° 10′ 9″

In this case, there is no need to use the advanced search option. If you like, you can click on the hyperlink to see Adwick Le Street on a map. This can help to resolve apparent ambiguities. For example, in the case of my ancestor John Woodhouse, I had seen him described as living in Doncaster as well as Adwick Le Street. As it happens, Doncaster is the nearest town. That tells me that the two names don’t point to two distant places, and I don’t have to worry about having made a mistake. But if one source said he was born in London and another in Doncaster, I would have a problem, and I would need to do further research to resolve the ambiguity.
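The “are these two places actually close together?” check can also be done with a little arithmetic. The Python sketch below converts coordinates in the degrees, minutes and seconds form shown in the search results to decimal degrees, then estimates the distance between the two records with the haversine formula. The function names are mine, and the Earth is treated as a sphere of radius 6371 km, so the answer is an approximation.

```python
import math
import re


def dms_to_decimal(text: str) -> float:
    """Convert a coordinate such as 'N 53° 34′ 14″' to signed decimal degrees."""
    hemisphere = text.strip()[0].upper()
    # Pull out the first three numbers: degrees, minutes, seconds.
    degrees, minutes, seconds = (
        float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)[:3]
    )
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by convention.
    return -value if hemisphere in "SW" else value


def distance_km(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in km using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


# The two records from the search results above.
village = (dms_to_decimal("N 53° 34′ 14″"), dms_to_decimal("W 1° 11′ 4″"))
castle = (dms_to_decimal("N 53° 33′ 14″"), dms_to_decimal("W 1° 10′ 9″"))
print(f"{distance_km(*village, *castle):.1f} km apart")
```

The two records come out roughly two kilometres apart, which is what you would expect for a village and a castle site that share its name. Two sources that disagreed by hundreds of kilometres would call for more research.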

Of Icebergs and the Internet


The use of online databases and search tools in family history research has provoked a kind of backlash among more traditional genealogists. It is often said that the Internet is just the tip of the iceberg, with real research taking place in libraries, family history centers and archives. And it really is true that only a fraction of the genealogical data available can be found online. That is changing, of course. More information becomes available online every day, and the amount of data available now was undreamed of a few years ago. But creating new digital repositories is no easy task, and it’s not free. So, for the foreseeable future, we should expect family history to involve working with microfilm, reference works, and even physical papers stored in libraries, churches and private collections.

But the value of digital libraries should not be underestimated; they really have revolutionized genealogical research. In part, I think, there is a kind of nostalgia for traditional methods and archives, and it is thoroughly understandable. But depending on whether you identify more strongly with the digital camp or the traditional camp, you may find yourself either exaggerating or understating both the sheer amount of information available in digital form and the relative comprehensiveness of that information. A bit of explanation is in order here. No matter how much data can be downloaded using just a web browser and an Internet connection, it won’t matter to you if the information you’re looking for is not among it. Comprehensiveness is the degree to which an archive or digital repository includes all of the data you might need, and not just certain resources, or data of a particular type. Right now, comprehensiveness is the Achilles’ heel of digital repositories. Sooner or later, you’re going to find yourself needing data that hasn’t been digitized and indexed, or documents that haven’t been scanned or photographed. Sure, there will be plenty of data out there to keep you busy, but there will always be those questions that remain unanswered until you start digging into special collections at the library, or spend some time ordering and reviewing microfilm at your local family history center.
