Learning Geospatial Data with GeoPandas
I don’t know about you guys, but I love maps. When I found out that you can use Python to play around with maps, I wanted to learn more about it. After doing some research I found that GeoPandas is one of the most widely used geospatial libraries for Python, so that’s what I dove into. GeoPandas is built upon NumPy and Pandas, so if you are familiar with those, GeoPandas is rather easy to pick up, and it has great documentation to help you out.
GeoPandas extends the datatypes used by Pandas to allow spatial operations on geometric types. The first thing you are going to need to do is install GeoPandas. If you use Anaconda you can install it with conda install geopandas; otherwise pip install geopandas does the job.
Next we need to import our libraries.
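Here is a minimal sketch of the imports this walkthrough relies on. The aliases gpd, pd and plt are just the usual conventions, not something GeoPandas requires.

```python
import geopandas as gpd          # GeoDataFrame, read_file, points_from_xy
import pandas as pd              # plain DataFrames and CSV reading
import matplotlib.pyplot as plt  # figures and axes for drawing the maps

# Handy when following along, since the sample-data API changed in 1.0.
print(gpd.__version__)
```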
From there I wanted to play around with the NYC boroughs dataset that is available as sample data for GeoPandas. Reading it is just like reading a CSV file with Pandas. Then we can use Matplotlib to pull up the map as if we were just graphing it, and we get a map of the five boroughs.
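A rough sketch of that step is below. It assumes GeoPandas 1.0 or newer, where the sample data lives in the separate geodatasets package (older releases exposed the same file through gpd.datasets.get_path("nybb")). The helper names load_boroughs and plot_boroughs are my own, made up for illustration.

```python
import geopandas as gpd


def load_boroughs():
    """Read the NYC borough boundaries into a GeoDataFrame."""
    # Assumption: the geodatasets package is installed. The first call
    # downloads the file and caches it, so it needs a network connection.
    from geodatasets import get_path

    return gpd.read_file(get_path("nybb"))


def plot_boroughs(boroughs):
    """Draw the borough polygons and return the Matplotlib Axes."""
    return boroughs.plot(figsize=(8, 8), edgecolor="black")
```

To try it, call boroughs = load_boroughs(), then plot_boroughs(boroughs), and finish with matplotlib.pyplot.show() if you are not in a notebook.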
Quickly looking at the DataFrame, the geometry column has some funky numbers for the coordinates but we’ll come back to that.
Now it’s time to get some outside data to lay on top of the map that GeoPandas gave us. data.nyc.gov provides a dataset of all of the subway locations with their longitude and latitude. It came as a CSV file, so the first thing I had to do was read it into a DataFrame.
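Reading the CSV might look something like the sketch below. The rows and the column names (station_name, longitude, latitude) are made up so the snippet runs without the download; the real file has its own columns.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded CSV (illustrative values only).
csv_text = """station_name,longitude,latitude
Station A,-73.99,40.75
Station B,-73.95,40.72
Station C,-73.98,40.68
"""

# With the real download this would be pd.read_csv("path/to/your_file.csv").
subway_df = pd.read_csv(io.StringIO(csv_text))

# Longitude and latitude should come back as floats, not strings.
print(subway_df.dtypes)
```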
Then we have to turn the DataFrame into a GeoDataFrame. It is similar to creating a normal DataFrame, but we use GeoDataFrame() and need to create a geometry column with coordinates that GeoPandas can read, and set the coordinate reference system (CRS) to epsg:4326, which is plain longitude and latitude.
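One way to do the conversion is with gpd.points_from_xy, sketched here on a tiny made-up DataFrame. The column names are assumptions again; the thing to watch is that x is longitude and y is latitude.

```python
import geopandas as gpd
import pandas as pd

# Made-up rows standing in for the subway DataFrame.
subway_df = pd.DataFrame(
    {
        "station_name": ["Station A", "Station B"],
        "longitude": [-73.99, -73.95],
        "latitude": [40.75, 40.72],
    }
)

# Build Point geometries (x = longitude, y = latitude) and tag them
# as EPSG:4326 so GeoPandas knows these are degrees.
subway_gdf = gpd.GeoDataFrame(
    subway_df,
    geometry=gpd.points_from_xy(subway_df["longitude"], subway_df["latitude"]),
    crs="EPSG:4326",
)

print(subway_gdf.crs)
```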
Now if we try to plot the subway locations on the map, the points won’t line up with it, because the two datasets use different coordinate systems. This is where those funky numbers in the geometry column come from. To fix that we can convert the map of NYC to epsg:4326 as well.
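The conversion itself is a single to_crs call. The sketch below uses a made-up rectangle instead of the real boroughs, and it assumes the borough file is stored in EPSG:2263 (a New York state plane system measured in feet), which would explain the large coordinate values.

```python
import geopandas as gpd
from shapely.geometry import box

# Stand-in for the borough polygons: a rectangle in EPSG:2263 (feet).
boroughs = gpd.GeoDataFrame(
    {"name": ["demo area"]},
    geometry=[box(980000, 190000, 1000000, 210000)],
    crs="EPSG:2263",
)

# Reproject to longitude/latitude so it matches the subway points.
boroughs_wgs84 = boroughs.to_crs(epsg=4326)

# Bounds are now in degrees: (min lon, min lat, max lon, max lat).
print(boroughs_wgs84.total_bounds)
```

Note that to_crs returns a new GeoDataFrame, so the result has to be assigned to a variable for the change to stick.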
Using Matplotlib we are able to plot the two GeoDataFrames on top of each other by drawing both on the same axes from subplots(). One thing I want to try next is making the points bigger at stations that have more stops.
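Layering the two comes down to passing the same ax to both plot calls. Here is a small sketch with made-up shapes standing in for the boroughs and the stations. Both frames are already in EPSG:4326.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import Point, box

# Made-up stand-ins that already share the same CRS.
area = gpd.GeoDataFrame(
    geometry=[box(-74.05, 40.68, -73.90, 40.80)], crs="EPSG:4326"
)
stations = gpd.GeoDataFrame(
    geometry=[Point(-73.99, 40.75), Point(-73.95, 40.72)], crs="EPSG:4326"
)

fig, ax = plt.subplots(figsize=(8, 8))
area.plot(ax=ax, color="lightgrey", edgecolor="black")  # base layer first
stations.plot(ax=ax, color="red", markersize=20)        # points on top
ax.set_title("Subway stations over the base map")
# Call plt.show() here when running as a script.
```

The markersize argument is also where the bigger-points idea could plug in later, since it controls how large each point is drawn.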
To wrap everything up, I want to go over what we did in this blog. First we learned what GeoPandas is and how to load a sample map with it. Next we converted coordinates to a different coordinate system so the two datasets would match when we plotted them. We figured out how to create a GeoDataFrame from a Pandas DataFrame and finally how to plot the two GeoDataFrames together. That’s all I have for now. I hope you are looking forward to joining me on this journey to map-a-palooza.