The power of algorithmic crowds.

There’s power in numbers, even for machine learning algorithms.

Introduction

One of the most fascinating historical examples of the power of crowds appears in James Surowiecki’s “The Wisdom of Crowds.” A team of engineers, oceanographers, salvage crew members, and mathematicians was asked to estimate where a particular sunken submarine, the Scorpion, could be found (the Navy did not have the manpower to search the entire area and wanted a more specific guess). Individually, these guessers were wildly inaccurate, but the group’s combined guess was only 220 yards (about 200 m) from the actual position of the sunken submarine!

It has been shown repeatedly that one of the most reliable ways to arrive at an estimate is to ask many people from diverse backgrounds; the more diverse, the better. How can we apply this sociological concept to machine learning?

Ensemble Learning

An ensemble model is a collection of models whose predictions are combined, most simply by averaging, to produce a “crowd’s guess.” Just as humans have biases, models carry inherent assumptions and biases of their own. Averaging across several models tends to reduce error, particularly when the models make different, weakly correlated mistakes.
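To make the averaging idea concrete, here is a minimal toy simulation. It is not the setup used for the fraud data: the "models" are hypothetical stand-ins, each with an assumed fixed bias plus Gaussian noise, and the function name `simulate` and all parameters are invented for illustration. It compares each stand-in's mean absolute error (MAE) with the error of their averaged prediction.

```python
import random
import statistics


def simulate(n_models=5, n_samples=1000, seed=0):
    """Compare per-model MAE with the MAE of the averaged prediction.

    Hypothetical setup: each "model" predicts the true value plus its own
    fixed bias and some independent Gaussian noise.
    """
    rng = random.Random(seed)
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_models)]

    individual_errors = [[] for _ in range(n_models)]
    ensemble_errors = []

    for _ in range(n_samples):
        truth = rng.uniform(0.0, 10.0)
        preds = [truth + b + rng.gauss(0.0, 1.0) for b in biases]

        # Absolute error of each individual "model" on this sample.
        for i, p in enumerate(preds):
            individual_errors[i].append(abs(p - truth))

        # Absolute error of the crowd's guess: the plain average.
        ensemble_errors.append(abs(statistics.fmean(preds) - truth))

    per_model_mae = [statistics.fmean(errs) for errs in individual_errors]
    return per_model_mae, statistics.fmean(ensemble_errors)


per_model_mae, ensemble_mae = simulate()
print("per-model MAE:", [round(m, 3) for m in per_model_mae])
print("ensemble MAE: ", round(ensemble_mae, 3))
```

The absolute error of an average can never exceed the average of the individual absolute errors, so the ensemble is always at least as good as the typical member. It is not guaranteed to beat the single best member, and the gain shrinks when the members share the same bias or make correlated mistakes.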

The Data

For this example, I used credit card fraud data from Kaggle. The first item of business was to pick an evaluation metric. The classes were hugely imbalanced (99.8% of the transactions were marked as normal; the other 0.2% were fraudulent), so accuracy was out of the picture: a model that labels every transaction as normal would score 99.8% accuracy while catching no fraud at all. For this type of problem, it’s better to choose precision, recall, or F1 score. For simplicity, I chose precision.
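The effect of the imbalance on each metric can be sketched with a few lines of plain Python. The labels below are synthetic (2 fraud cases per 1,000 transactions, matching the 0.2% rate), not the Kaggle data, and the helper `classification_metrics` is a hypothetical name. Returning 0.0 when a denominator is zero is a convention chosen here for the sketch.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = fraud)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Assumed convention: report 0.0 when a metric is undefined (zero denominator).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Synthetic labels: 2 fraudulent transactions followed by 998 normal ones.
y_true = [1, 1] + [0] * 998

# Baseline that never flags fraud.
always_normal = [0] * 1000

# A hypothetical detector: catches one fraud, misses one, raises three false alarms.
detector = [1, 0] + [1, 1, 1] + [0] * 995

baseline_metrics = classification_metrics(y_true, always_normal)
detector_metrics = classification_metrics(y_true, detector)
print("always normal:", baseline_metrics)
print("detector:     ", detector_metrics)
```

The never-flag baseline gets the higher accuracy of the two, yet its precision and recall are both zero, which is why accuracy is a poor guide here. Precision alone also has a blind spot: it ignores missed fraud, so recall or F1 is worth checking alongside it.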
