How To Get Open Street Map Data with Python

How To Get Open Street Map Data Using Python When your app needs to know more about the world around us

Have you ever been working on a project where you need some real-world geographical location data to, for example, say, how many highways cross through this particular city or how many restaurants are in this particular locality?

OpenStreetMap is a great open-source map of the world that can give us some insight into these and similar questions. There is a lot of data hidden in this data set, full of useful labels and geographic information.

The OpenStreetData Model

Let’s have a look at how OSM is structured.

We have three basic components in the OSM data model, which are nodes, ways, and relations, that all come with an ID. Many of the elements come with tags that describe specific features represented as key-value pairs.

In simple terms, nodes are points on the maps (in latitude and longitude) as in the next image of a well documented India Gate in Delhi.

Another way is an ordered list of nodes, which could correspond to a street or the outline of a house. Here is an example of NH 24 in India.

The final data element is a relation which is also an ordered list containing either nodes, ways, or even other relations.

It is used to model logical or geographic relationships between objects. This can be used for example for large structures like the Parliament of India which contains multiple polygons to describe the building.

Using the Overpass API

Now we’ll take a look at how to load data from OSM. The Overpass API uses a custom query language to define the queries.

It takes some time getting used to, but luckily there is Overpass Turbo by Martin Raifer which comes in handy to interactively evaluate our queries directly in the browser.

Let’s say you want to query nodes for cafes, then your query looks like this:

node["amenity"="cafe"]({{bbox}}); out;

Where each statement in the query source code ends with a semicolon. This query starts by specifying the component we want to query, which is, in this case, a node.

We are applying a filter by tag on our query which looks for all the nodes where the key-value pair is "amenity"="cafe". There are different options to filter by a tag that can be found in the documentation.

There are a variety of tags to choose from, one common key is amenity which covers various community facilities like a cafe, restaurant, or just a bench. To have an overview of most of the other possible tags in OSM take a look at the OSM map features or taginfo.

Another filter is the bounding box filter where {{bbox}} corresponds to the bounding box in which we want to search and work only in Overpass Turbo.

Otherwise, you can specify a bounding box by (south, west, north, east) in latitude and longitude which can look like:

node["amenity"="pub"]
  (53.2987342,-6.3870259,53.4105416,-6.1148829); 
out;

Which you can try in Overpass Turbo. As we saw before in the OSM data model, there are also ways and relations that might hold the same attribute.

We can get those as well by using a union block statement, which collects all outputs from the sequence of statements inside a pair of parentheses as in:

( node["amenity"="cafe"]({{bbox}});
  way["amenity"="cafe"]({{bbox}});
  relation["amenity"="cafe"]({{bbox}});
);
out;

The next way to filter our queries is by element id. Here is the example for the query node(1); out;, which gives us the prime meridian of the world with longitude close to zero.

Another way to filter queries is by area which can be specified like area["ISO3166-1"="GB"][admin_level=2];, which gives us the area for Great Britain.

We can use this now as a filter for the query by adding (area) to our statement as in:

area["ISO3166-1"="GB"][admin_level=2];
node["place"="city"](area);
out;

This query returns all cities in Great Britain. It is also possible to use a relation or a way as an area. In this case, area IDs need to be derived from an existing OSM way by adding 2400000000 to its OSM ID, or, in case of relation, by adding 3600000000.

Note that not all ways/relations have an area counterpart (i.e. those that are tagged with area=no, and most multipolygons and that don’t have a defined name=* will not be part of areas).

If we apply the relation of Great Britain to the previous example, we’ll then get:

area(3600062149);
node["place"="city"](area);
out;

Finally, we can specify the output of the queried data, which is configured by the out action. Until now, we specified the output as out;, but there are various additional values that can be appended.

The first set of values can control the verbosity or the detail of information of the output, such as ids, skel, body(default value), tags, meta, and count as described in the documentation.

Additionally, we can add modifiers for the geocoded information. geom adds the full geometry to each object. This is important when returning relations or ways that have no coordinates associated and you want to get the coordinates of their nodes and ways.

For example, the query rel["ISO3166-1"="GB"][admin_level=2]; out geom; would otherwise not return any coordinates. The value bb adds only the bounding box to each way and relation, and center adds only the center of the same bounding box.

The sort order can be configured by asc and qt, sorting by object ID or by quadtile index respectively, where the latter is significantly faster. Lastly, by adding an integer value, you can set the maximum number of elements to return.

After combining what we have learned so far, we can finally query the location of all Biergarten in Germany.

area["ISO3166-1"="DE"][admin_level=2];

( node["amenity"="biergarten"](area);
  way["amenity"="biergarten"](area);
  rel["amenity"="biergarten"](area);
);
out center;

Accessing with Python

To access the Overpass API with Python use the overpy package as a wrapper. Here you can see how we can translate the previous example with the overpy package:

import overpy

api = overpy.Overpass()
r = api.query("""
area["ISO3166-1"="DE"][admin_level=2];
(node["amenity"="biergarten"](area);
 way["amenity"="biergarten"](area);
 rel["amenity"="biergarten"](area);
);
out center;
""")

coords  = []
coords += [(float(node.lon), float(node.lat)) 
           for node in r.nodes]
coords += [(float(way.center_lon), float(way.center_lat)) 
           for way in r.ways]
coords += [(float(rel.center_lon), float(rel.center_lat)) 
           for rel in r.relations]

One nice thing about overpy is that it detects the content type (i.e. XML, JSON) from the response. For further information take a look at their documentation.

Thank you for reading !

#Python #Data Science #Machine Learning #Programming

How To Get Open Street Map Data with Python
24.45 GEEK