Unsupervised on the Streets of New York

Taking a Deeper Look at Gentrifying Census Tracts with Cluster Classification

Born and raised in New York City, I watched Harlem change from the overwhelmingly minority neighborhood I knew as a child in the mid-to-late 1990s into the gentrified haven of brownstones it is today. Academic papers have examined how neighborhoods change, and one of the more recent, Ellen & Ding (2016), was the benchmark against which I compared my model.

Taking a data science approach, I wanted to ask a very specific question:

Could a machine learning algorithm detect gentrification?

The most important thing to understand about this project is that it uses unsupervised learning: the model was never given a target variable. Instead of predicting an already decided outcome, it takes the data and groups the census tracts on its own. At the end, I compare those groupings with the findings of the academic paper referenced above.
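To make the idea concrete, here is a minimal sketch of clustering without a target variable. It is a plain k-means written with the standard library only, not the model used in this project; the feature values, the `kmeans` helper, and the choice of two clusters are all assumptions for illustration.

```python
import random


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means: returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k rows picked at random
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


# Hypothetical tracts: (median income in $1000s, share of residents with a degree).
# Features are left unscaled for brevity, so income dominates the distance.
tracts = [(32, 0.12), (35, 0.15), (30, 0.10), (88, 0.55), (92, 0.61), (85, 0.58)]

# No labels are passed in: the grouping comes from the data alone.
labels = kmeans(tracts, k=2)
print(labels)
```

The cluster numbers themselves carry no meaning; what matters is which tracts end up together. Deciding whether a cluster corresponds to "gentrifying" is an interpretation made afterwards, which is why the comparison with the academic paper is needed.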


U.S. census tract data was the best fit for this kind of work, and after some research I found a diversity study from Brown University, the Longitudinal Tract Data Base (LTDB). The study collected and compiled census and American Community Survey (ACS) information from 2000, 2010, and 2012.

The data fell naturally into two sets. General demographic data came from the census itself: age, family size, race, and ethnicity. The survey data was much more detailed, covering immigration status, types of employment, and income. All of these features are included for every census tract in the four boroughs. Combining the two datasets seemed the most productive way to identify gentrification. The project proceeded as follows.

  1. Data Cleaning & Preprocessing
  2. Data Exploration
  3. Cluster Creation
  4. Qualitative Comparison with Ellen & Ding (2016)
  5. Conclusion & Further Work
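
As a rough illustration of step 1, the sketch below joins a census-style table and a survey-style table on a tract identifier and then standardizes each feature so that no single column dominates a distance-based clustering. The tract keys, column names, and values are hypothetical, and this is only one plausible way to prepare such data, not the preprocessing actually used here.

```python
from statistics import mean, pstdev

# Hypothetical demographic rows from the census, keyed by tract identifier.
census = {
    "tract_A": {"median_age": 34.0, "family_size": 2.9},
    "tract_B": {"median_age": 41.0, "family_size": 2.4},
    "tract_C": {"median_age": 29.0, "family_size": 3.3},
}
# Hypothetical survey rows; tract_C is missing on purpose.
survey = {
    "tract_A": {"median_income": 31000.0},
    "tract_B": {"median_income": 87000.0},
}

# Inner join: keep only tracts that appear in both sources.
merged = {t: {**census[t], **survey[t]} for t in census if t in survey}


def standardize(rows):
    """Rescale every column to zero mean and unit (population) standard deviation."""
    columns = sorted(next(iter(rows.values())))
    stats = {}
    for col in columns:
        values = [row[col] for row in rows.values()]
        stats[col] = (mean(values), pstdev(values))
    return {
        tract: {
            # A constant column has zero spread, so map it to 0.0 instead of dividing.
            col: (row[col] - stats[col][0]) / stats[col][1] if stats[col][1] else 0.0
            for col in columns
        }
        for tract, row in rows.items()
    }


scaled = standardize(merged)
for tract, features in scaled.items():
    print(tract, features)
```

An inner join silently drops tracts that lack survey data; whether to drop or impute such tracts is a cleaning decision that depends on how many are affected.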

