Search⌘ K
AI Features

Geolocation

Explore how to normalize and query GeoNames geolocation data effectively in PostgreSQL. Understand handling incomplete data and leveraging GiST indexes to perform efficient spatial lookups and analysis with real datasets.

How the breakdown works

Before loading the raw data into a normalized version of the table, which will make heavy use of the references we normalized before, we have to study and understand how the breakdown works:

PostgreSQL
select count(*) as all,
count(*) filter(where country_code is null) as no_country,
count(*) filter(where admin1_code is null) as no_region,
count(*) filter(where admin2_code is null) as no_district,
count(*) filter(where feature_class is null) as no_class,
count(*) filter(where feature_code is null) as no_feat
from raw.geonames ;

We have lots of entries without reference to a country and even more without a detailed breakdown (admin1_code and admin2_code are not always part of the data). Moreover, we also have points without any reference feature and class, some of them in the Arctic.

   all    | no_country | no_region | no_district | no_class | no_feat
----------+------------+------
...