How I Built Time Series Data Out of Cross-Sectional Uber Travel Times Data

I knew I wanted to do two things in the process of writing my bachelor’s thesis: improve my programming skills and work with time-series data prediction.

What I didn’t know, however, was what I wanted to study. It had to be something I truly liked, even if it wasn’t connected to my major.

Since I like maps and… things that move, I decided to somehow use data from a website I had recently come across and fallen in love with (probably after playing many hours of SimCity as a teenager) — the Uber Movement website.

It allows you to visualize anonymized data for average travel times from a certain point (or zone) to any other point in that same city.

Sample travel times for my home town, São Paulo.

“Great!”, I thought. “Let me just download the travel times and plot some numbers so I can move on to exploratory data analysis (EDA).”

As it turns out, the data I wanted was not so easy to extract.

The Problem

To get statistically reliable results, I needed all the data I could get. More rows of data would, potentially, give my models more predictive power, and the smaller the time increment, the better. So I needed to get my hands on daily travel times.

The Uber Movement website allows you to download data from any zone to every other zone in the city. However, there’s a catch. Whatever date range you select, the download does not contain daily data.

That is, if you choose to download values from January 2020 to March 2020, you won’t receive 90 values, which is roughly the number of days in that range. Rather, it spits out a CSV file with a single value for each pair of zones: the average travel time over the whole three months.
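To make the gap concrete, here is a minimal sketch of what a daily series for that range would hold compared with what the aggregated download holds. The travel times are made-up numbers for a single hypothetical pair of zones, and `days_in_range` is a helper introduced only for this illustration.

```python
from datetime import date, timedelta
from statistics import mean


def days_in_range(start: date, end: date) -> list[date]:
    """Return every calendar day from start to end, inclusive."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


days = days_in_range(date(2020, 1, 1), date(2020, 3, 31))

# Hypothetical daily travel times in seconds for one pair of zones.
# These values are invented; they are not Uber Movement data.
daily_times = [600 + (i % 7) * 10 for i in range(len(days))]

# A daily series would have one value per day in the range.
print(len(daily_times))  # 91 (2020 is a leap year: 31 + 29 + 31)

# The aggregated download collapses all of them into one average.
print(mean(daily_times))
```

The exact count for this range is 91 days rather than 90, but the point stands: the download gives one number per pair of zones where a time series would need one per day.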

Different formats of travel times data you can download for every given pair of zones.

This meant that I had to compromise on the number of data points over time in order to get lots of values for a single point in time.
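One way to picture getting out of this trade-off is to stack many dated cross-sections into a single series per pair of zones. The sketch below assumes you already have one export per single-day date range; the column names (`origin`, `destination`, `mean_travel_time`) and all values are hypothetical, not the actual Uber Movement file layout, and `build_series` is a helper written only for this illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical cross-sectional exports, keyed by the single day they cover.
# Column names and values are made up for illustration.
cross_sections = {
    "2020-01-01": "origin,destination,mean_travel_time\n1,2,610\n1,3,905\n",
    "2020-01-02": "origin,destination,mean_travel_time\n1,2,655\n1,3,880\n",
    "2020-01-03": "origin,destination,mean_travel_time\n1,2,640\n",
}


def build_series(exports):
    """Stack dated cross-sections into one time series per pair of zones."""
    series = defaultdict(list)
    for day in sorted(exports):  # ISO dates sort chronologically as strings
        for row in csv.DictReader(io.StringIO(exports[day])):
            pair = (row["origin"], row["destination"])
            series[pair].append((day, float(row["mean_travel_time"])))
    return dict(series)


series = build_series(cross_sections)
for pair, points in series.items():
    print(pair, points)
```

Each cross-section contributes at most one point to each pair's series, so a pair missing from a given day (like zones 1 and 3 on the third day here) simply ends up with a gap that has to be handled later.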
