OpenRefine

OpenRefine looks like a spreadsheet, but it is actually “a powerful tool for working with messy data.” We’ll use OpenRefine to normalize place names from imprint statements in English Short Title Catalogue (ESTC) records and to fetch latitude and longitude coordinates for those places, but you may find it useful for lots of other things, too.

OpenRefine can be an especially useful tool if you know a little bit about regular expressions or can do some basic scripting (using variables, control flow statements, etc.). You don’t need to know those things to get good use out of the tool, but if, as you proceed in your project, you think you’re going to have a lot of data cleaning to do, it could be worth learning something about them (we’ll help).
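To give a sense of what regular expressions and a little scripting can do for messy place names, here is a minimal sketch in Python. It is only an illustration, not the workflow we’ll follow in OpenRefine: the function name normalize_place, the VARIANTS lookup table, and the sample strings are all made up for this example and are not taken from ESTC data.

```python
import re

# Hypothetical lookup of variant spellings -> preferred form.
# These entries are illustrative only, not drawn from ESTC records.
VARIANTS = {
    "londini": "London",
    "edinburgi": "Edinburgh",
}

def normalize_place(raw: str) -> str:
    """Return a cleaned-up place name from a raw imprint fragment."""
    # Drop square brackets (often used for information supplied by a cataloguer).
    text = re.sub(r"[\[\]]", "", raw)
    # Collapse runs of whitespace into single spaces and trim the ends.
    text = re.sub(r"\s+", " ", text).strip()
    # Remove trailing punctuation and spaces such as " :", ",", ";" or ".".
    text = re.sub(r"[\s:;,.]+$", "", text)
    # Use the preferred form for known variants; otherwise just title-case.
    return VARIANTS.get(text.lower(), text.title())

samples = ["[London]", "LONDON :", "Londini,", "  New   York. "]
cleaned = [normalize_place(s) for s in samples]
print(cleaned)  # ['London', 'London', 'London', 'New York']
```

The same three ideas (strip stray punctuation, tidy whitespace, map variants to one preferred spelling) are what we’ll be doing inside OpenRefine, just with its own tools rather than a standalone script.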

Please download and install OpenRefine 2.6-rc2 for your platform (it’s available for Windows, macOS, and Linux). OpenRefine runs as a local web server on your machine: it doesn’t have its own application window, but instead opens its interface in a tab or window in your web browser (by default at http://127.0.0.1:3333). When you close that tab or window, OpenRefine continues running in the background (at least on macOS), so you’ll want to remember to explicitly quit OpenRefine when you’re done.