Setting Up Your Own Geocoder Part 1

Please note, part 1 covers all the software requirements to install the Pelias geocoder. The actual installation of the Pelias geocoder will be covered in part 2.

If you work in the area of GIS, you eventually have to do your own geocoding. Now before we begin, I will forewarn you that this isn’t for the faint of heart. And depending on what dataset you choose to start with, this could be a multiday process. But if you’re like me, and you realize that geocoding with ESRI can quickly chew through credits, you’ve begun to look for alternatives. Well my friend, I’m about to walk you through how to do that.

Before I begin though, I must mention that there are two primary locally hosted geocoding options. Perhaps the best known is Nominatim. Nominatim is the geocoder used by OpenStreetMap and runs on PostgreSQL and PostGIS. You’ve probably even used it before without realizing it. For many Python geocoding packages and tutorials, Nominatim is the default geocoding service. GeoPy’s walkthrough, for example, demonstrates GeoPy using Nominatim. The second option is Pelias. I’ll be walking you through Pelias today.

Now why am I choosing Pelias over Nominatim? Pelias offers support out of the box for Linux and macOS. Nominatim only offers support for Linux. The Nominatim documentation does note that third-party users have written instructions for other platforms, but the Nominatim team does not support them (in my experience these generally work well, but I did not go this route for my geocoding). Pelias also runs on Elasticsearch instead of PostgreSQL. What does this mean? It is much faster (though PostgreSQL is catching up). See this Stack Overflow thread for further discussion.

If speed and computational efficiency are your concern, it’s a no-brainer that you have to run this on Linux. It’s been installed on Windows, but that’s more work than I care to do. If you have an IT team, you’ll need to ask them to provision you with a Linux server (I recommend Ubuntu 22.04 Jammy Jellyfish). I would ask for at least 500 GB of storage space, and more if you want the entire planet. For my needs, I really only needed the state of Colorado, but I decided to do all of North America just because I had the space. If you’re doing this for a personal project on your own equipment, I would recommend setting up a Linux virtual machine on your device. At this time, I can’t recommend any guides, but a Google search should set you up accordingly.

I will preface this by saying, I am not a Linux administrator and have spent the bulk of my life working in Windows or MacOS. The next series of instructions covers some issues that I ran across working in Linux. If you’re more proficient in Linux than I am or you have a Linux administrator who is working with you, it’s likely you won’t run into these issues. However, my IT team is a purely Windows team so I was on my own once they set up the Linux server. You will need to make sure that you have administrative privileges.

When a Linux server is first installed, it is a command-line terminal only. If you want a desktop environment GUI, you will need to install it. I don’t think it’s strictly necessary, but it can help you understand Linux’s file structure. Begin by opening the terminal and issuing the following commands.

sudo apt update
sudo apt install tasksel

The sudo command runs a command with root/administrative privileges. apt is a command-line tool that allows you to install and update packages. tasksel is a tool that installs groups of related packages as a single task, which is how we will install a graphical desktop environment. There are multiple desktop environments available. If you have no particular preference, I’d just go for the default Ubuntu desktop, which uses GNOME.

sudo tasksel install ubuntu-desktop

tasksel will download and install the necessary packages. During the installation, you may be prompted to select a default display manager for the system. There are pros and cons to each display manager, but gdm3 is a good default choice if you have no particular preference. When the installation finishes, reboot your system with the command sudo reboot.

In case your desktop environment doesn’t start and you’re back at the terminal, you just need to enter a command to make sure your system boots into the graphical environment. You can do this with:

sudo systemctl set-default graphical.target
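If you want to see which target the system is currently set to boot into, before or after changing it, something like the following should work. This is only a sketch: the variable name target is my own, and the fallback to "unknown" is there so the snippet doesn’t fail on a machine without systemd.

```shell
# Ask systemd which target the machine boots into by default.
# Falls back to "unknown" if systemctl is missing or cannot answer.
if command -v systemctl >/dev/null 2>&1; then
  target=$(systemctl get-default 2>/dev/null || echo "unknown")
else
  target="unknown"
fi

echo "Default boot target: $target"
```

A server without a desktop normally reports multi-user.target; after the set-default command above it should report graphical.target.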

If you set up your system and accepted all of the defaults for Linux, there is a possibility that your disk space was not configured correctly, leaving the root volume smaller than the disk you were given. I don’t particularly care to rewrite the blog post, but I followed the instructions found here. The commands you need to enter can be found in the code block below if you prefer to skip the explanation in the aforementioned link.

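df -h # run to verify how much space is being used
sudo lvextend -l +100%FREE /dev/mapper/ubuntu--vg-ubuntu--lv # grows the logical volume into unallocated space (assumes the default Ubuntu LVM layout)
sudo resize2fs /dev/mapper/ubuntu--vg-ubuntu--lv # grows the root filesystem to fill the logical volume
df -h # run one more time to confirm that the root drive has been increased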
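If you would rather get a yes/no answer than read the df table, a small check like the one below can help. Treat it as a sketch: the 500 GB figure is just the amount I suggested asking for earlier, REQUIRED_GB and avail_gb are names I made up, and it assumes GNU df, which is the default on Ubuntu.

```shell
# Hypothetical threshold: the 500 GB suggested earlier in this post.
REQUIRED_GB=500

# Free space on the root filesystem in whole gigabytes.
# --output and -BG are GNU df options (the default df on Ubuntu).
avail_gb=$(df -BG --output=avail / | tail -n 1 | tr -dc '0-9')

if [ "$avail_gb" -ge "$REQUIRED_GB" ]; then
  echo "OK: ${avail_gb} GB free on /"
else
  echo "Only ${avail_gb} GB free on /, wanted ${REQUIRED_GB} GB"
fi
```

Adjust REQUIRED_GB to match the dataset you plan to import; a single state needs far less than the whole planet.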

While the Pelias documentation does provide for an installation from scratch, I don’t recommend that as it’s a lot more work. These instructions will use the Docker image. An important note here: depending on your organization, there is a possibility that you already have Docker available. If you do, there’s no need to follow the next series of instructions. Instead, contact your administrator to get Docker installed on your Linux server. If you don’t have a license, Docker Desktop is free if you meet one of the following conditions: you’re an individual, a non-commercial open source developer, a student, an educator, or a small business with fewer than 250 employees AND less than $10 million in revenue. All government orgs, however, must purchase a subscription (at the time of this writing, I was an employee of North Metro Fire Rescue District). For additional information, please review Docker’s FAQ.

What does that mean if you’re not one of the above? You’ll have to install Docker Engine and Docker Compose separately using the open source installation. Docker tries to steer you away from this option, but when you don’t have the budget, you don’t have the budget.

The official documentation is here for your reference.

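If this is a fresh install of Linux, there should be no packages to uninstall. If you’re not sure, the following commands should be used. On a fresh install they won’t hurt your system, so there’s no worrying about accidentally deleting something. On a machine that already runs Docker, be aware that they also delete existing images, containers, and volumes.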

sudo apt-get purge docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin docker-ce-rootless-extras
sudo rm -rf /var/lib/docker
sudo rm -rf /var/lib/containerd
for pkg in docker.io docker-doc docker-compose podman-docker containerd runc; do sudo apt-get remove $pkg; done

Set up the repository.

sudo apt-get update
sudo apt-get install ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo \
  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
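The echo command above is dense, so here is a sketch of what it actually writes into docker.list. The function name docker_repo_line is my own invention, and amd64 and jammy are example values; on your machine they come from dpkg --print-architecture and from VERSION_CODENAME in /etc/os-release.

```shell
# Build the single line that the repository setup writes to
# /etc/apt/sources.list.d/docker.list.
# $1 = CPU architecture (example: amd64)
# $2 = Ubuntu release codename (example: jammy)
docker_repo_line() {
  printf 'deb [arch=%s signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu %s stable\n' "$1" "$2"
}

# Example values only; your architecture and codename may differ.
docker_repo_line amd64 jammy
```

You can compare this against the real file with cat /etc/apt/sources.list.d/docker.list. If the codename or architecture is blank there, the apt-get update in the next step will fail.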

Install Docker Engine.

sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

Verify the install was successful by running the hello-world image.

sudo docker run hello-world

There is a possibility that you may run into an issue with Docker Compose. As of the writing of this blog, the Pelias scripts call Docker Compose v1. I believe they’re planning to update to v2, but they haven’t as of this writing. I may fork the project and change it in the future, but I haven’t gotten around to it while I wait for the official team to do so. However, the installation above most likely installed Docker Compose v2. You can confirm this by running the command docker compose version. It returns the version in the form MAJOR.MINOR.PATCH, where the first number is the major version. If you have v2 installed, you will need to make some edits. When you clone the Pelias Docker repository, open the pelias install script located in the file ./pelias. Everywhere that the command docker-compose is found, you will need to change it to docker compose. For our purposes, that is the only difference between Docker Compose v1 and v2: the command docker-compose becomes docker compose.
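If you’d rather not make those edits by hand, the sketch below shows one way to do it with sed. It is not the Pelias team’s method, just how I would approach it. The function name compose_major and the variable sample are mine, and the two sample lines are made-up stand-ins for what the real script contains. The example practices on a throwaway file so it is safe to run anywhere.

```shell
# Return the major version from a string such as "2.20.2" or "v2.20.2".
compose_major() {
  version=${1#v}          # drop a leading "v" if present
  echo "${version%%.*}"   # keep everything before the first dot
}

# Example version string, not read from your system.
compose_major "2.20.2"

# Practice on a throwaway file first. These two lines are made-up
# stand-ins for the kind of calls you would edit in ./pelias.
sample=$(mktemp)
printf '%s\n' 'docker-compose pull' \
  'docker-compose -f docker-compose.yml up -d' > "$sample"

# Replace the command, but leave file names such as docker-compose.yml
# alone. -i.bak edits in place and keeps a backup copy next to the file.
sed -i.bak \
  -e 's/docker-compose\([^.]\)/docker compose\1/g' \
  -e 's/docker-compose$/docker compose/' \
  "$sample"

cat "$sample"
```

On a real system you could feed compose_major the output of docker compose version --short. Once the sed output looks right on the sample, point the same command at ./pelias and review the result before running it; the .bak copy lets you undo the change. Note that this simple pattern would also rewrite a name like docker-compose-plugin, so check for those first.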

While Pelias does require git, git is usually installed by default. You can confirm this by running the command git --version. If git isn’t installed for some reason, you can install it by using the command sudo apt install git.
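If you are scripting your setup, the same check can be written so that it doesn’t stop with an error when git is missing. This is a minimal sketch, and git_status is just a name I picked.

```shell
# Report whether git is on the PATH without failing if it is not.
if command -v git >/dev/null 2>&1; then
  git_status="present"
  git --version
else
  git_status="missing"
  echo "git not found; install it with: sudo apt install git"
fi
```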

The featured image is by Gabriel Heinzer on Unsplash.
