Download all the cities, towns, villages
Tap into a single dataset that covers cities, towns, villages, and hamlets worldwide. Every record comes directly from OpenStreetMap, so you benefit from the platform’s community-verified coverage and frequent updates.

Each entry includes the essentials you need for analysis or app development:
- primary and alternate names, plus the full display label
- detailed address hierarchy down to ISO country codes
- population figures when available
- centroid coordinates and bounding boxes
- unique OSM identifiers (osm_id, osm_type) for precise referencing
- optional border data that links the place to its administrative geometry
Use the identifiers to jump into the Geoapify Place Details API for richer geometry or attribution, or keep everything offline by working straight from the downloadable files.
About This Dataset
We ship the data as country-specific archives named after the ISO 3166-1 alpha-2 country code. Typical filenames look like us.zip for the United States, ca.zip for Canada, or de.zip for Germany.
Inside each archive you’ll find up to four newline-delimited JSON files, one per settlement tier:
place-city.ndjson
place-town.ndjson
place-village.ndjson
place-hamlet.ndjson
Download only the granularity you need—no planet-scale unzip just to work with a single country.
The files use NDJSON (newline-delimited JSON): each line is one complete JSON record, so you can stream a file line by line without loading the whole dataset into memory.
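Because each archive keeps one file per settlement tier, a single tier can be streamed straight out of the ZIP without extracting it first. The sketch below uses only the Python standard library. The function name iter_places is hypothetical, and the code assumes the NDJSON files sit at the root of the archive and are UTF-8 encoded.

```python
import io
import json
import zipfile


def iter_places(archive_path, member="place-city.ndjson"):
    """Yield one record (dict) per line of an NDJSON file stored inside a ZIP archive."""
    with zipfile.ZipFile(archive_path) as archive:
        with archive.open(member) as raw:
            # ZipFile.open returns a binary stream; wrap it to decode UTF-8 text.
            for line in io.TextIOWrapper(raw, encoding="utf-8"):
                line = line.strip()
                if line:  # skip blank lines, e.g. a trailing newline
                    yield json.loads(line)
```

A call such as `for place in iter_places("us.zip"): ...` would then process U.S. cities one record at a time.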
How we compile the dataset
We curate the records directly from OpenStreetMap with two complementary passes:
- Settlements (place=*) – we pick up every feature tagged as city, town, village, or hamlet.
- Administrative boundaries – we add polygons tagged as boundary=administrative when they reference the same settlement level (through linked_place, border_type, or place tags) or clearly represent municipal areas via their admin_level.
This approach gives you both the populated place features and the boundaries that frame them.
Heads-up: When we import boundaries, we first try to attach them to the matching populated place. If the boundary exists on its own (no place=* partner), we still include it and set inferred_type to the settlement level it represents: city, town, village, or hamlet. That flag tells you how to categorise the record even though the original OSM feature is tagged only as boundary=administrative. Some areas in OSM remain mapped this way, so you may encounter boundary-only rows until local mappers add the corresponding place nodes or ways.
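One way to treat regular and boundary-only rows uniformly is to resolve a single settlement level per record. The following is a minimal sketch: the helper name settlement_level is hypothetical, and it assumes boundary-only rows carry a type other than the four settlement levels (for example administrative) alongside inferred_type.

```python
SETTLEMENT_LEVELS = {"city", "town", "village", "hamlet"}


def settlement_level(record):
    """Return the settlement level of a record, or None if it cannot be determined."""
    place_type = record.get("type")
    if place_type in SETTLEMENT_LEVELS:
        return place_type
    # Boundary-only rows: fall back to the level stored in inferred_type, if any.
    return record.get("inferred_type")
```

With this helper, a filter such as `settlement_level(r) == "town"` would pick up both place=town features and boundaries inferred to be towns.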
Dataset Schema
Each record in the download follows the same schema so you can switch between countries without remapping columns:
- name, other_names, and display_name capture the primary and alternate labels for the place.
- address provides the nested administrative hierarchy such as county, state, ISO codes, and country code.
- population contains the crowd-sourced population value when it exists in OSM.
- osm_type, osm_id, and type identify the original OpenStreetMap feature and its settlement level.
- location and bbox store the centroid coordinates and bounding box in [lon, lat] order.
- border links to the matching administrative geometry (single object or array), and inferred_type flags inferred settlement levels when only a boundary is available.
The table below summarises every field:
Field | Description |
---|---|
name | The name of the place, e.g., "North Vancouver," "Aachen," "Edmonton." |
other_names | A dictionary of alternate names in various languages and official names where applicable. |
display_name | The full display name, including location details such as regional districts, states, and countries. |
address | Information about the location, including city name, county, state, country, country code, etc. |
population | The population of the place. |
osm_type | The data type in the OpenStreetMap database (e.g., "relation", "node" or "way"). |
osm_id | The unique identifier in the OpenStreetMap database for the place. |
type | The place classification provided by OSM, such as city, town, village, or administrative. |
location | The geographical coordinates of the place as [longitude, latitude]. |
bbox | The bounding box [minLon, minLat, maxLon, maxLat] describing the geographic extent of the place. |
border | Optional matching boundary=administrative feature(s) linked to the place. Depending on the source it can be a single object or an array of objects with the same fields as the top-level record. |
inferred_type | When only an administrative boundary exists, this field indicates the settlement level we inferred for the place. |
Here’s what a single record looks like. It shows the US city "Worcester" with the border block that mirrors the same fields for the administrative outline (nested values are shown collapsed):
- name: "Worcester"
- other_names: {…} (4 keys)
- display_name: "Worcester, Worcester County, Massachusetts, United States"
- address: {…} (6 keys)
- population: 206518
- osm_type: "node"
- osm_id: 158851900
- type: "city"
- location: […] (2 items)
- bbox: [-71.9618877, 42.1025621, -71.6418877, 42.4225621]
- border: […] (2 items)
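Since both location and bbox use longitude-first order, a quick containment check is an easy way to confirm that coordinates are being read the right way round. The sketch below uses the Worcester values shown on this page. The helper name bbox_contains is hypothetical, and the check assumes the box does not cross the antimeridian.

```python
def bbox_contains(bbox, lon, lat):
    """Check whether a point lies inside a [minLon, minLat, maxLon, maxLat] box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


worcester_bbox = [-71.9618877, 42.1025621, -71.6418877, 42.4225621]
worcester_location = [-71.8018877, 42.2625621]  # [longitude, latitude]

print(bbox_contains(worcester_bbox, *worcester_location))  # True
```

Swapping longitude and latitude by mistake makes the check fail for this record, which is a cheap sanity test when wiring the data into a mapping library that expects [lat, lon].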
Linking places with administrative boundaries
The border field lets you work with both the populated place (place=*) and its matching boundary=administrative geometry without extra lookups. Each entry in border mirrors the top-level schema (name, identifiers, coordinates, and bounding box), so you can reuse the same parsing logic. Depending on the data available in OSM, border can be a single object or an array of objects. When the dataset only has an administrative relation for a settlement, the inferred_type property shows which settlement level we mapped it to.
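Because border may be absent, a single object, or an array, it can help to normalise it to a list before processing. This is a minimal sketch; the helper name get_borders is hypothetical.

```python
def get_borders(record):
    """Return the record's linked boundaries as a list (empty if there are none)."""
    border = record.get("border")
    if border is None:
        return []
    if isinstance(border, dict):
        # A single linked boundary is stored as one object.
        return [border]
    # Otherwise it is already an array of boundary objects.
    return list(border)
```

Downstream code can then always loop with `for border in get_borders(record): ...` and read border["osm_type"] and border["osm_id"] the same way as on the top-level record.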
Below is a sample of U.S. records that illustrates how population, bounding boxes, and administrative borders are bundled together.
name | state | population | osm_type/id | location | border |
---|---|---|---|---|---|
Wooster | Ohio | 27,232 | node / 154340168 | [-81.9397733, 40.7980976] | relation / 183148 |
Worcester | Massachusetts | 206,518 | node / 158851900 | [-71.8018877, 42.2625621] | relation / 1844209 |
Wrangell | Alaska | 2,127 | node / 150917119 | [-132.3829431, 56.4706022] | relation / 2605235 |
Yakima | Washington | 96,968 | node / 3413361438 | [-120.510842, 46.601557] | relation / 237752 |
Yonkers | New York | 211,569 | node / 158846316 | [-73.8987469, 40.9312099] | relation / 174494 |
Copyright and License
The data contained in this dataset is compiled from the OpenStreetMap (OSM) database and is subject to the copyright terms and conditions of OSM contributors.
OpenStreetMap® is open data, licensed under the Open Data Commons Open Database License (ODbL) by the OpenStreetMap Foundation (OSMF).
You are free to copy, distribute, transmit, and adapt the data, as long as you credit OpenStreetMap and its contributors. If you alter or build upon our data, you may distribute the result only under the same license.
Related Tutorials

Cities, Towns, Villages and Hamlets dataset: How to Use for Data Analysis

JavaScript Heatmap Example: A Step-by-Step Tutorial to Build Heatmaps with MapLibre GL
Frequently Asked Questions
How to get the original OSM object and its geometry?
To retrieve the original OSM object and its geometry, call the Geoapify Place Details API with the osm_id and osm_type values from the dataset. The response contains detailed information about the place, including its boundary polygons and additional attributes sourced from OpenStreetMap.
Here is a URL example:
https://api.geoapify.com/v2/place-details?osm_id=62422&osm_type=r&apiKey=YOUR_API_KEY
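The same URL can be assembled from a dataset record in Python. This sketch only builds the request URL and does not call the API. The helper name place_details_url is hypothetical, and the mapping of node and way to n and w is an assumption extrapolated from the osm_type=r value in the example above, so check it against the API documentation before relying on it.

```python
from urllib.parse import urlencode

PLACE_DETAILS_URL = "https://api.geoapify.com/v2/place-details"

# Assumption: the API expects the first letter of the OSM type, as in osm_type=r.
OSM_TYPE_CODES = {"node": "n", "way": "w", "relation": "r"}


def place_details_url(record, api_key, lang=None):
    """Build a Place Details request URL from a dataset record."""
    params = {
        "osm_id": record["osm_id"],
        "osm_type": OSM_TYPE_CODES[record["osm_type"]],
    }
    if lang:
        params["lang"] = lang  # optional language code, e.g. "es"
    params["apiKey"] = api_key
    return f"{PLACE_DETAILS_URL}?{urlencode(params)}"


print(place_details_url({"osm_id": 62422, "osm_type": "relation"}, "YOUR_API_KEY"))
```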
I noticed duplicate rows in the dataset. What could be the reason for this?
Duplicate rows are a side effect of how we import the data. We take every object from the OpenStreetMap database tagged place=city, place=town, place=village, or place=hamlet, together with matching administrative boundaries, and several of these objects can reference the same geographical location. For example, one entry may represent the city boundary, while another describes the city center. Here is an example:
{"name":"Berlin",… ,"population":3769962,"osm_type":"relation","osm_id":62422,"type":"administrative","location":[13.3888599,52.5170365],"bbox":[13.088345,52.3382448,13.7611609,52.6755087]}
{"name":"Berlin",… ,"population":3769962,"osm_type":"node","osm_id":240109189,"type":"city","location":[13.3888599,52.5170365],"bbox":[13.2288599,52.3570365,13.5488599,52.6770365]}
The first row corresponds to OSM relation 62422, and the second row corresponds to OSM node 240109189.
This duplication reflects how geographic data is organized in OpenStreetMap, where distinct objects represent different aspects of a place. We include both entries so that neither the boundary nor the place feature is lost.
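If you need one row per place, the duplicates can be grouped and reduced after loading. The sketch below is one possible heuristic, not part of the dataset tooling: it assumes that duplicate rows share an identical name and location, as the two Berlin rows do, and it prefers the row that is not typed administrative. The function names are hypothetical.

```python
from collections import defaultdict


def group_duplicates(records):
    """Group records that share the same name and centroid location."""
    groups = defaultdict(list)
    for record in records:
        key = (record.get("name"), tuple(record.get("location") or ()))
        groups[key].append(record)
    return groups


def pick_primary(group):
    """Prefer the place feature over the administrative boundary row."""
    for record in group:
        if record.get("type") != "administrative":
            return record
    return group[0]  # boundary-only group: keep the first row


def deduplicate(records):
    """Return one record per (name, location) group."""
    return [pick_primary(group) for group in group_duplicates(records).values()]
```

Places whose boundary and node have slightly different centroids will not be merged by this key, so treat the result as a first pass.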
I noticed some cities are in "no-country.zip." Why is this the case?
"No country" cities may indicate that the data in OpenStreetMap lacks complete country information. This could be due to the nature of crowd-sourced data. You can cross-reference with additional sources or use other attributes provided to assign the city to the correct country.
How can I view place names in another language?
The NDJSON files store the native OpenStreetMap labels. To get a localized version, call the Geoapify Place Details API with the record’s osm_id and osm_type, and pass the lang parameter (for example lang=es for Spanish or lang=ja for Japanese). The response returns translated display names and address components where available, so you can present the dataset in your preferred language without manual lookups.
How can I get all the cities in a specific U.S. state?
Filter on the address fields to find all places within a particular administrative area. Here is a Python sample that selects all cities in California, assuming data is a list of records (dictionaries) loaded from place-city.ndjson:
cities_in_california = [city for city in data if city.get('address', {}).get('state') == 'California']
Can you provide an example of how to read an ndjson file in Python?
Here's a Python example using pandas, followed by an alternative that uses the built-in json module:
Using pandas:
import pandas as pd
# Read the NDJSON file; lines=True parses one JSON object per line
data = pd.read_json('place-city.ndjson', lines=True)
Using json (alternative method):
import json
data = []
# Parse the NDJSON file line by line, skipping blank lines
with open("place-city.ndjson", encoding="utf-8") as f:
    for line in f:
        if line.strip():
            data.append(json.loads(line))
The pandas version returns a DataFrame, while the json version builds a list of dictionaries. Choose whichever fits your downstream processing.
Can you provide an example of how to read an ndjson file in JavaScript?
You can use the readline module to read ndjson files line by line in Node.js. Here's an example in JavaScript:
const readline = require('readline');
const fs = require('fs');
const data = [];
const readStream = readline.createInterface({
  input: fs.createReadStream('place-village.ndjson')
});
readStream.on('line', (line) => {
  data.push(JSON.parse(line));
});
readStream.on('close', () => {
  console.log(data); // Process the data as needed
});
What is the source of the population data in this dataset?
The population data in this dataset is sourced directly from OpenStreetMap, where contributors may provide population information for various locations. Keep in mind that this data is crowd-sourced and may not always be up-to-date or comprehensive.
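Since population is only present when OSM contributors have supplied it, code that ranks or aggregates by population should skip rows without a usable value. A minimal sketch, assuming population is stored as a number as in the sample record; the function name largest_places is hypothetical.

```python
def largest_places(records, limit=10):
    """Return up to `limit` records with a numeric population, largest first."""
    with_population = [
        record for record in records
        if isinstance(record.get("population"), (int, float))
    ]
    return sorted(with_population, key=lambda r: r["population"], reverse=True)[:limit]


sample = [
    {"name": "Worcester", "population": 206518},
    {"name": "Yonkers", "population": 211569},
    {"name": "Unknown"},  # no population recorded in OSM
]
print([place["name"] for place in largest_places(sample)])  # ['Yonkers', 'Worcester']
```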
How frequently is this dataset updated?
Cities, towns, villages, and hamlets change relatively rarely in OpenStreetMap (OSM), so this kind of data stays stable over time and the dataset does not generally need frequent refreshes.
However, we are committed to maintaining the accuracy and reliability of this dataset. If you encounter problems or identify inaccuracies in the data, please don't hesitate to contact us. We will promptly address any issues and provide updates as needed to ensure the dataset remains a valuable and dependable resource for your use.