Making mobility easier by working with GTFS and Python
GTFS: a brief introduction
Mobility is always an interesting and challenging topic, especially for those who live in concrete jungles. To give a hand to people who commute every day, GTFS (General Transit Feed Specification) was created. A GTFS feed consists of a set of text files, and such feeds are provided by a myriad of transportation companies as well as individuals. These files let us retrieve a great deal of information about the timetables of bus and train lines, all organized in a well-defined pattern. Thanks to the many people who collaborate to enrich this body of data, it is possible to extract mobility information from different cities around the globe.
To illustrate the sort of information you can find in these files: routes, stops, stop times, shapes to be plotted on a map, and so on. This data is really valuable for anyone who would like to develop software and mobile applications that make it easy to retrieve bus data for different cities around the globe and help people commute.
Setting goals
This series of articles outlines some simple ways to work with the data published by different bus companies, using Python and SQLite. The example used throughout the articles relies on data provided by EPTC, the Porto Alegre bus company in Brazil. You can follow the same steps for other data sets, although some of the data may change from provider to provider.
Our data set
First of all, you need to download the data set you want to work with. In this case, we can go to the following link to download our sample data:
The sample data is a zip file that contains the following files:
- agency.txt
- calendar.txt
- calendar_dates.txt
- fare_attributes.txt
- feed_info.txt
- routes.txt
- shapes.txt
- stop_times.txt
- stops.txt
- trips.txt
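Before going any further, it can help to peek inside the archive. The snippet below is a minimal sketch, assuming only the Python standard library; the helper names `list_gtfs_files` and `read_header` are made up for this illustration, and the tiny in-memory archive in the demo contains invented rows, not the EPTC data.

```python
import csv
import io
import zipfile


def list_gtfs_files(zip_source):
    """Return the sorted names of the .txt files inside a GTFS zip archive."""
    with zipfile.ZipFile(zip_source) as archive:
        return sorted(name for name in archive.namelist() if name.endswith(".txt"))


def read_header(zip_source, member):
    """Return the column names found on the first line of one file in the archive."""
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member) as raw:
            # "utf-8-sig" drops a byte order mark if the provider's file has one.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return next(csv.reader(text))


if __name__ == "__main__":
    # Build a tiny stand-in archive in memory so the example runs on its own.
    # With real data you would pass the path of the zip file you downloaded.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("stops.txt", "stop_id,stop_name\n1,Example Stop A\n")
        archive.writestr("routes.txt", "route_id,route_short_name\n10,T1\n")

    for name in list_gtfs_files(buffer):
        print(name, read_header(buffer, name))
```

Reading only the header line is a cheap way to check which columns a given provider actually ships before deciding how to store them.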
You can find further information about the content of these files in Google's GTFS documentation at the link below:
https://developers.google.com/transit/gtfs
The goal here is not to explain exactly what GTFS files are. Rather, we aim to work with the data we can get from different data stores and provide services to our users. Furthermore, we are not going to use all of these files in the articles, since some of them are not valuable for what we intend to do. However, once you learn how to consume them and create a readable database, it becomes easier to manage them according to your needs.
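To give a rough idea of what "consuming a file and creating a readable database" can look like, here is a minimal sketch using Python's built-in `csv` and `sqlite3` modules. It is not necessarily the approach the following chapters take: the function name `load_table` is hypothetical, every column is stored as TEXT for simplicity, and the sample rows are invented for the demo.

```python
import csv
import io
import sqlite3


def load_table(connection, table_name, text_stream):
    """Create a table from a GTFS text file's header and insert its rows.

    Returns the number of rows inserted.
    """
    reader = csv.reader(text_stream)
    columns = next(reader)  # the first line of a GTFS file holds the field names

    # Quote identifiers and store everything as TEXT (a simplifying assumption).
    column_sql = ", ".join(f'"{name}" TEXT' for name in columns)
    placeholders = ", ".join("?" for _ in columns)

    connection.execute(f'CREATE TABLE IF NOT EXISTS "{table_name}" ({column_sql})')
    rows = [row for row in reader if row]  # skip blank lines
    connection.executemany(
        f'INSERT INTO "{table_name}" VALUES ({placeholders})', rows
    )
    connection.commit()
    return len(rows)


if __name__ == "__main__":
    # Invented sample content shaped like a stops.txt file.
    sample = io.StringIO(
        "stop_id,stop_name,stop_lat,stop_lon\n"
        "1,Example Stop A,-30.03,-51.23\n"
        "2,Example Stop B,-30.04,-51.22\n"
    )
    conn = sqlite3.connect(":memory:")  # use a file path to keep the database
    print(load_table(conn, "stops", sample))
    for row in conn.execute("SELECT stop_id, stop_name FROM stops ORDER BY stop_id"):
        print(row)
```

Taking the column names from the header line means the same function can be reused for any of the files, even when a provider adds or omits optional fields.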
Technology and tools to help on our journey
The technology we are going to use here is quite simple: Python and SQLite. Later we will go deeper, working with Android mobile development and RESTful services in order to expose the data. Nevertheless, the best way to begin is to start from scratch.
You can find more information on how to install Python and SQLite in the links below. I like using DBeaver to connect to the database and run queries, but you are free to opt for the SQL management software you like best. As for an IDE (Integrated Development Environment), you can find the Visual Studio Code download link below; it makes handling your Python code easier. Again, these tools are only suggestions, so feel free to pick the ones that suit your preferences.
Now that we have the prerequisites all set, it is time to get started with converting the txt files into a SQLite database. Let’s move on then! Stay tuned for the next chapters.