Download all the cities, towns, villages, and hamlets

Tap into a single dataset that covers cities, towns, villages, and hamlets worldwide. Every record comes directly from OpenStreetMap, so you benefit from the platform’s community-verified coverage and frequent updates.

Salt Lake City boundary overlay generated from the dataset.

Each entry includes the essentials you need for analysis or app development:

primary and alternate names, plus the full display label
detailed address hierarchy down to ISO country codes
population figures when available
centroid coordinates and bounding boxes
unique OSM identifiers (osm_id, osm_type) for precise referencing
optional border data that links the place to its administrative geometry

Use the identifiers to jump into the Geoapify Place Details API for richer geometry or attribution, or keep everything offline by working straight from the downloadable files.

About This Dataset

We ship the data as country-specific archives using the ISO alpha-2 code. Typical filenames look like us.zip for the United States, ca.zip for Canada, or de.zip for Germany.

Inside each archive you’ll find up to four newline-delimited JSON files, one per settlement tier:

place-city.ndjson
place-town.ndjson
place-village.ndjson
place-hamlet.ndjson

Download only the granularity you need—no planet-scale unzip just to work with a single country.

We use NDJSON (newline-delimited JSON): every line is one complete JSON object, so you can read and process a file record by record instead of loading the whole file into memory. That makes large country files much easier to handle.
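As a rough sketch of that line-by-line workflow, the helper below streams records from one tier file inside a country archive without extracting it to disk. The function name iter_archive_records is illustrative, and the sketch assumes the tier files sit at the root of the archive under the names listed above and are UTF-8 encoded.

```python
import io
import json
import zipfile


def iter_archive_records(archive_path, tier="city"):
    """Yield one dict per line of place-<tier>.ndjson inside a country archive."""
    member = f"place-{tier}.ndjson"  # assumption: files sit at the archive root
    with zipfile.ZipFile(archive_path) as archive:
        if member not in archive.namelist():
            return  # this archive has no file for the requested tier
        with archive.open(member) as raw:
            # Decode the binary stream as text and parse it one line at a time.
            for line in io.TextIOWrapper(raw, encoding="utf-8"):
                line = line.strip()
                if line:  # skip blank lines
                    yield json.loads(line)
```

With us.zip downloaded, a loop such as for place in iter_archive_records("us.zip", tier="town") would then walk through the U.S. towns one record at a time.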

How we compile the dataset

We curate the records directly from OpenStreetMap with two complementary passes:

Settlements (place=*) – we pick up every feature tagged as city, town, village, or hamlet.
Administrative boundaries – we add polygons tagged as boundary=administrative when they reference the same settlement level (through linked_place, border_type, or place tags) or clearly represent municipal areas via their admin_level.

This approach gives you both the populated place features and the boundaries that frame them.

Heads-up: When we import boundaries we first try to attach them to the matching populated place. If the boundary exists on its own (no place=* partner), we still include it and set inferred_type to the settlement level it represents—city, town, village, or hamlet. That flag tells you how to categorise the record even though the original OSM feature is tagged only as boundary=administrative. Some areas in OSM remain mapped this way, so you may encounter boundary-only rows until local mappers add the corresponding place nodes or ways.
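One way to handle boundary-only rows in your own code is to resolve a single settlement level per record. The sketch below is an illustration, not part of the dataset tooling: settlement_level is a hypothetical helper, and it assumes boundary-only rows carry type "administrative" plus the inferred_type flag described above.

```python
SETTLEMENT_LEVELS = {"city", "town", "village", "hamlet"}


def settlement_level(record):
    """Return the settlement tier for a record, or None if it cannot be determined."""
    place_type = record.get("type")
    if place_type in SETTLEMENT_LEVELS:
        return place_type  # regular place=* feature
    # Boundary-only row: fall back to the inferred_type flag, if present.
    return record.get("inferred_type")
```

Calling it on every row gives you one consistent category column, whether the row came from a place=* feature or from a standalone boundary.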

Dataset Schema

Each record in the download follows the same schema so you can switch between countries without remapping columns:

name, other_names, and display_name capture the primary and alternate labels for the place.
address provides the nested administrative hierarchy such as county, state, ISO codes, and country code.
population contains the crowd-sourced population value when it exists in OSM.
osm_type, osm_id, and type identify the original OpenStreetMap feature and its settlement level.
location and bbox store the centroid coordinates and bounding box in [lon, lat] order.
border links to the matching administrative geometry (single object or array) and inferred_type flags inferred settlement levels when only a boundary is available.

The table below summarises every field:

Field	Description
name	The name of the place, e.g., "North Vancouver," "Aachen," "Edmonton."
other_names	A dictionary of alternate names in various languages and official names where applicable.
display_name	The full display name, including location details such as regional districts, states, and countries.
address	Information about the location, including city name, county, state, country, country code, etc.
population	The population of the place.
osm_type	The data type in the OpenStreetMap database (e.g., "relation", "node" or "way").
osm_id	The unique identifier in the OpenStreetMap database for the place.
type	The place classification provided by OSM, such as `city`, `town`, `village`, or `administrative`.
location	The geographical coordinates of the place as `[longitude, latitude]`.
bbox	The bounding box `[minLon, minLat, maxLon, maxLat]` describing the geographic extent of the place.
border	Optional matching `boundary=administrative` feature(s) linked to the place. Depending on the source it can be a single object or an array of objects with the same fields as the top-level record.
inferred_type	When only an administrative boundary exists, this field indicates the settlement level we inferred for the place.

Here’s what a single record looks like. It shows the US city "Worcester" with the border block that mirrors the same fields for the administrative outline:

name: "Worcester"
other_names: { … } (4 keys)
display_name: "Worcester, Worcester County, Massachusetts, United States"
address: { … } (6 keys)
population: 206518
osm_type: "node"
osm_id: 158851900
type: "city"
location: [ … ] (2 items)
bbox: [-71.9618877, 42.1025621, -71.6418877, 42.4225621]
border: [ … ] (2 items)
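Because location and bbox use longitude-first order, it is easy to swap the values by accident when passing them to tools that expect latitude first. The sketch below shows two small, hypothetical helpers for that. It uses the Worcester bounding box from the record above and the Worcester centroid from the sample table further down, and it does not handle boxes that cross the antimeridian.

```python
def to_lat_lon(location):
    """Convert the dataset's [lon, lat] pair to a (lat, lon) tuple."""
    lon, lat = location
    return lat, lon


def bbox_contains(bbox, location):
    """Check whether a [lon, lat] point lies inside [minLon, minLat, maxLon, maxLat]."""
    min_lon, min_lat, max_lon, max_lat = bbox
    lon, lat = location
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


worcester_bbox = [-71.9618877, 42.1025621, -71.6418877, 42.4225621]
worcester_location = [-71.8018877, 42.2625621]

print(bbox_contains(worcester_bbox, worcester_location))
print(to_lat_lon(worcester_location))
```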

Linking places with administrative boundaries

The border field lets you work with both the populated place (place=*) and its matching boundary=administrative geometry without extra lookups. Each entry in border mirrors the top-level schema—name, identifiers, coordinates, and bounding box—so you can reuse the same parsing logic. Depending on the data available in OSM, border can be a single object or an array of objects. When the dataset only has an administrative relation for a settlement, the inferred_type property shows which settlement level we mapped it to.
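Since border can be missing, a single object, or an array, it helps to normalise it before looping. The following is a minimal sketch with hypothetical helper names (border_entries, border_refs); it relies only on the field names described in the schema above.

```python
def border_entries(record):
    """Return the record's border data as a list, whatever shape it was stored in."""
    border = record.get("border")
    if border is None:
        return []        # no linked boundary
    if isinstance(border, dict):
        return [border]  # single object
    return list(border)  # already an array of objects


def border_refs(record):
    """List (osm_type, osm_id) pairs for every linked boundary."""
    return [(b.get("osm_type"), b.get("osm_id")) for b in border_entries(record)]
```

After this step, the rest of your code can treat border as a list in every case.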

Below is a sample of U.S. records that illustrates how population, bounding boxes, and administrative borders are bundled together.

name	state	population	osm_type/id	location	border
Wooster	Ohio	27,232	node / 154340168	[-81.9397733, 40.7980976]	relation / 183148
Worcester	Massachusetts	206,518	node / 158851900	[-71.8018877, 42.2625621]	relation / 1844209
Wrangell	Alaska	2,127	node / 150917119	[-132.3829431, 56.4706022]	relation / 2605235
Yakima	Washington	96,968	node / 3413361438	[-120.510842, 46.601557]	relation / 237752
Yonkers	New York	211,569	node / 158846316	[-73.8987469, 40.9312099]	relation / 174494

Copyright and License

The data contained in this dataset is compiled from the OpenStreetMap (OSM) database and is subject to the copyright terms and conditions of OSM contributors.

OpenStreetMap® is open data, licensed under the Open Data Commons Open Database License (ODbL) by the OpenStreetMap Foundation (OSMF).

You are free to copy, distribute, transmit, and adapt the data, as long as you credit OpenStreetMap and its contributors. If you alter or build upon our data, you may distribute the result only under the same license.

Frequently Asked Questions

How to get the original OSM object and its geometry?

Use the Geoapify Place Details API. Pass the osm_id and osm_type values from the dataset record, and the API returns detailed information about the place from the OpenStreetMap database, including its geometry (boundaries and polygons) and additional attributes.

Here is an example URL:

https://api.geoapify.com/v2/place-details?osm_id=62422&osm_type=r&apiKey=YOUR_API_KEY
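If you need to generate such URLs for many records, a small helper can build them from the dataset fields. This is a sketch under stated assumptions: place_details_url is a hypothetical function, the code "r" for relations is taken from the example URL above, and the codes "n" and "w" for nodes and ways are assumed by analogy, so check them against the API documentation. The function only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

PLACE_DETAILS_URL = "https://api.geoapify.com/v2/place-details"

# "r" for relations matches the example URL; "n" and "w" are assumptions.
OSM_TYPE_CODES = {"relation": "r", "node": "n", "way": "w"}


def place_details_url(record, api_key, lang=None):
    """Build a Place Details request URL from a dataset record."""
    params = {
        "osm_id": record["osm_id"],
        "osm_type": OSM_TYPE_CODES[record["osm_type"]],
    }
    if lang:
        params["lang"] = lang  # optional language code, e.g. "es"
    params["apiKey"] = api_key
    return f"{PLACE_DETAILS_URL}?{urlencode(params)}"


print(place_details_url({"osm_id": 62422, "osm_type": "relation"}, "YOUR_API_KEY"))
```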

I noticed duplicate rows in the dataset. What could be the reason for this?

Duplicate rows are a side effect of our import process. We import every OpenStreetMap object tagged place=city, place=town, place=village, or place=hamlet, together with the matching administrative boundaries. Sometimes several objects refer to the same geographical location: one entry may represent the city boundary, while another describes the city center. Here is an example:

{"name":"Berlin",… ,"population":3769962,"osm_type":"relation","osm_id":62422,"type":"administrative","location":[13.3888599,52.5170365],"bbox":[13.088345,52.3382448,13.7611609,52.6755087]}
{"name":"Berlin",… ,"population":3769962,"osm_type":"node","osm_id":240109189,"type":"city","location":[13.3888599,52.5170365],"bbox":[13.2288599,52.3570365,13.5488599,52.6770365]}

The first row corresponds to OSM relation 62422 (type "administrative"), and the second row corresponds to OSM node 240109189 (type "city").

This duplication results from how geographic data is organized in OpenStreetMap, with distinct objects representing various aspects of a place. We've included both entries to provide a more comprehensive view of each location within the dataset.
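If you prefer one row per place, you can collapse such pairs yourself. The sketch below groups records that share the same name and centroid, as the two Berlin rows do, and keeps one record per group. Matching on identical coordinates is a heuristic we assume for illustration; duplicates with slightly different centroids would not be caught, and both function names are hypothetical.

```python
from collections import defaultdict


def group_duplicates(records):
    """Group records that share a name and centroid, as in the Berlin example."""
    groups = defaultdict(list)
    for record in records:
        key = (record.get("name"), tuple(record.get("location") or ()))
        groups[key].append(record)
    return groups


def pick_one(group, prefer_type="relation"):
    """Keep a single record per group, preferring the given osm_type."""
    for record in group:
        if record.get("osm_type") == prefer_type:
            return record
    return group[0]  # no record of the preferred type: keep the first one
```

Preferring "relation" keeps the boundary row with its wider bounding box; passing prefer_type="node" keeps the place node instead.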

I noticed some cities are in "no-country.zip." Why is this the case?

"No country" cities may indicate that the data in OpenStreetMap lacks complete country information. This could be due to the nature of crowd-sourced data. You can cross-reference with additional sources or use other attributes provided to assign the city to the correct country.

How can I view place names in another language?

The NDJSON files store the native OpenStreetMap labels. To get a localized version, call the Geoapify Place Details API with the record’s osm_id and osm_type, and pass the lang parameter (for example lang=es for Spanish or lang=ja for Japanese). The response returns translated display names and address components where available, so you can present the dataset in your preferred language without manual lookups.

How can I get all the cities in a specific U.S. state?

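Use the address fields to find all places within a particular administrative area. Here is a Python code sample that finds all cities in the state of California, where data is a list of records loaded with the json module. It uses .get() because not every record is guaranteed to have a state in its address:

cities_in_california = [city for city in data if city.get('address', {}).get('state') == 'California']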

Can you provide an example of how to read an ndjson file in Python?

Here's the Python example using pandas and the alternative method with json for reading ndjson files:

Using pandas:

import pandas as pd

# Read the NDJSON file; lines=True parses one JSON object per line
data = pd.read_json('place-city.ndjson', lines=True)

Using json (alternative method):

import json

data = []

# Read the NDJSON file line by line, skipping blank lines
with open("place-city.ndjson", encoding="utf-8") as f:
    for line in f:
        if line.strip():
            data.append(json.loads(line))

Both methods load the file into a Python data structure for further processing: pandas gives you a DataFrame, while the json approach gives you a plain list of dictionaries. Choose whichever fits the rest of your pipeline.
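For larger files, such as place-village.ndjson for a big country, you may not want the whole file in memory at once. The sketch below is one possible streaming variant using only the standard library: a generator yields records one at a time, and a second helper counts places per state on the fly. Both function names are hypothetical, and the "(no state)" label is our own placeholder for records without a state in their address.

```python
import json
from collections import Counter


def iter_ndjson(path):
    """Yield one parsed record per non-empty line, without loading the whole file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_by_state(path):
    """Count records per address.state while streaming through the file."""
    counts = Counter()
    for record in iter_ndjson(path):
        state = (record.get("address") or {}).get("state")
        counts[state or "(no state)"] += 1
    return counts
```

For example, count_by_state("place-city.ndjson").most_common(5) would list the five states with the most city records.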

Can you provide an example of how to read an ndjson file in JavaScript?

You can use the readline module to read ndjson files line by line in Node.js. Here's an example in JavaScript:


const readline = require('readline');
const fs = require('fs');

const data = [];

const readStream = readline.createInterface({
  input: fs.createReadStream('place-village.ndjson')
});

readStream.on('line', (line) => {
  if (line.trim()) data.push(JSON.parse(line)); // skip blank lines
});

readStream.on('close', () => {
  console.log(data); // Process the data as needed
});

What is the source of the population data in this dataset?

The population data in this dataset is sourced directly from OpenStreetMap, where contributors may provide population information for various locations. Keep in mind that this data is crowd-sourced and may not always be up-to-date or comprehensive.
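Because population is optional, code that sorts or aggregates by it should tolerate missing values. The sketch below shows one way to do that on a list of already loaded records. The helper names are hypothetical, and treating only numeric values as valid is an assumption about how you want to handle unexpected entries.

```python
def largest_places(records, limit=10):
    """Return up to `limit` records with a numeric population, largest first."""
    with_population = [
        r for r in records
        if isinstance(r.get("population"), (int, float))
    ]
    return sorted(with_population, key=lambda r: r["population"], reverse=True)[:limit]


def population_coverage(records):
    """Share of records (0.0 to 1.0) that carry a numeric population value."""
    records = list(records)
    if not records:
        return 0.0
    known = sum(isinstance(r.get("population"), (int, float)) for r in records)
    return known / len(records)
```

Checking population_coverage first gives a quick sense of how far you can rely on the population field for a given country or tier.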

How frequently is this dataset updated?

Cities, towns, villages, and hamlets tend to be stable in OpenStreetMap (OSM) and change relatively rarely. As a result, the dataset generally does not need frequent, regular updates.

However, we are committed to maintaining the accuracy and reliability of this dataset. If you encounter problems or identify inaccuracies in the data, please don't hesitate to contact us. We will promptly address any issues and provide updates as needed to ensure the dataset remains a valuable and dependable resource for your use.