Spotify User Analytics
Containerized end-to-end analytics of Spotify data using Python, dbt, Postgres, and Metabase.
In this project, we will be analyzing our listening history, top tracks & artists, and genres from Spotify. Here are the tools that we will be using:
- Python - Fetching data from the Spotify API endpoints and saving it to CSV files
- Postgres - The database where our data is stored and queried
- dbt (data build tool) - Data modeling tool that transforms our staging data into fact tables, dimension tables, and views
- Metabase - Dashboarding tool to analyze our data
- Docker - Containerizing our applications, i.e. Postgres, dbt, and Metabase
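The Python step in the list above can be sketched roughly as follows. This is a minimal illustration, not the project's actual main.py: it assumes a payload shaped like the Spotify Web API's "recently played" response (a list of items, each with a track and a played_at timestamp), uses a hard-coded sample instead of a live request, and the names flatten_recently_played and write_csv are hypothetical.

```python
import csv
import io

# Hypothetical sample shaped like a "recently played" response from the
# Spotify Web API; a real run would fetch this with an access token.
SAMPLE_PAYLOAD = {
    "items": [
        {
            "played_at": "2021-06-01T08:15:00.000Z",
            "track": {
                "id": "track-1",
                "name": "Example Song",
                "artists": [{"name": "Artist A"}, {"name": "Artist B"}],
            },
        },
    ]
}

# Column order of the CSV file (an assumption made for this sketch).
FIELDNAMES = ["played_at", "track_id", "track_name", "artists"]


def flatten_recently_played(payload):
    """Turn the nested JSON payload into flat rows, one per play."""
    rows = []
    for item in payload.get("items", []):
        track = item["track"]
        rows.append({
            "played_at": item["played_at"],
            "track_id": track["id"],
            "track_name": track["name"],
            # A track can have several artists; join them into one cell.
            "artists": "; ".join(a["name"] for a in track["artists"]),
        })
    return rows


def write_csv(rows, handle):
    """Write rows to any file-like object as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Write to an in-memory buffer here; a real script would open a file instead.
buffer = io.StringIO()
write_csv(flatten_recently_played(SAMPLE_PAYLOAD), buffer)
csv_text = buffer.getvalue()
```

Keeping the flattening step separate from the writing step means the same rows could later be loaded straight into Postgres without going through a file.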
Project Files
- app
  - main.py - Our main ETL script that fetches data from the Spotify API endpoints and saves it to CSV
  - util.py - Helper module that contains the custom class SpotifyUtil
  - config_template.py - This is where we will store our credentials
- dbt
  - models - Contains the SQL scripts and schema.yml files that are used when we run our transformations
  - dbt_entrypoint.sh - Script that serves as the entrypoint when running the dbt container
  - Dockerfile - Contains the commands to create the custom Docker image
  - dbt_project.yml - YAML file to configure dbt
  - packages.yml - YAML file for test dependencies
  - profiles.yml - YAML file to configure the connection from dbt to Postgres
- metabase
  - metabase.db - Metadata database of Metabase for the dashboard
- docker-compose.yml - YAML file to orchestrate the Docker containers
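As an illustration of the kind of settings a credentials file such as config_template.py might hold, here is a minimal sketch. The README does not say which values the file actually contains, so everything here is an assumption: the SpotifyConfig class, the load_config function, and the SPOTIFY_* variable names are hypothetical, chosen because a Spotify app registration provides a client ID, a client secret, and a redirect URI.

```python
import os
from dataclasses import dataclass

# Hypothetical setting names; the real config file may use different ones.
REQUIRED_KEYS = (
    "SPOTIFY_CLIENT_ID",
    "SPOTIFY_CLIENT_SECRET",
    "SPOTIFY_REDIRECT_URI",
)


@dataclass(frozen=True)
class SpotifyConfig:
    """Credentials needed to authorize against the Spotify Web API."""
    client_id: str
    client_secret: str
    redirect_uri: str


def load_config(env=None):
    """Build a SpotifyConfig from a mapping, defaulting to os.environ."""
    if env is None:
        env = os.environ
    # Fail early with a clear message if any credential is absent or empty.
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return SpotifyConfig(
        client_id=env["SPOTIFY_CLIENT_ID"],
        client_secret=env["SPOTIFY_CLIENT_SECRET"],
        redirect_uri=env["SPOTIFY_REDIRECT_URI"],
    )


# Placeholder values for demonstration only; never commit real credentials.
example_config = load_config({
    "SPOTIFY_CLIENT_ID": "your-client-id",
    "SPOTIFY_CLIENT_SECRET": "your-client-secret",
    "SPOTIFY_REDIRECT_URI": "http://localhost:8080/callback",
})
```

Reading the values from a mapping rather than hard-coding them keeps secrets out of version control, which is the usual reason for shipping a template file instead of the real config.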