In this comprehensive guide, I’ll explore the different gears and pinions that make a machine learning model tick. If you search for the definition of “machine learning”, most sources will show the statement below:

“Machine Learning is the field of study that gives computers the capability to learn without being explicitly programmed”

I like to think of machine learning as raising a newborn baby. At first, it knows nothing of the ways of the world and needs help at every step. But as it slowly acquires data and knowledge, it evolves and learns to make decisions of its own.

To give machines “the ability to learn”, we need to jump through several hoops, or as I like to call them, the “Ten Commandments of Machine Learning”. They are listed below:

  • Acquiring data
  • Data Cleansing
  • Exploratory Data Analysis
  • Feature Engineering
  • Feature Selection
  • Train/Validation/Test split
  • Baseline model building
  • Hyperparameter Tuning
  • Model validation
  • Making Predictions
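
To make the list above concrete, here is a minimal end-to-end sketch that touches each of the ten steps on a toy problem. Everything in it is an assumption made for illustration: the synthetic data, the single feature, the mean-predictor baseline, the ridge-style linear fit, and the helper names `mse` and `fit_ridge`. It uses only the Python standard library, and a real project would look quite different at every step.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# 1) Acquiring data: a synthetic stand-in (y is roughly 3x + 2) with two missing values.
raw = [(x, 3 * x + 2 + random.gauss(0, 1)) for x in range(100)]
raw[5] = (5, None)
raw[40] = (None, 122.0)

# 2) Data cleansing: drop rows with missing values.
rows = [(x, y) for x, y in raw if x is not None and y is not None]

# 3) Exploratory data analysis: a quick look at summary statistics.
print("rows:", len(rows), "mean y:", round(statistics.mean(y for _, y in rows), 2))

# 4) Feature engineering: rescale x to the range [0, 1].
max_x = max(x for x, _ in rows)
rows = [(x / max_x, y) for x, y in rows]

# 5) Feature selection: trivial here, since there is only one feature.

# 6) Train/validation/test split (roughly 60/20/20) after shuffling.
random.shuffle(rows)
n = len(rows)
train = rows[: int(0.6 * n)]
valid = rows[int(0.6 * n): int(0.8 * n)]
test = rows[int(0.8 * n):]


def mse(predict, data):
    """Mean squared error of a prediction function on (x, y) pairs."""
    return statistics.mean((predict(x) - y) ** 2 for x, y in data)


# 7) Baseline model: always predict the mean of the training targets.
mean_y = statistics.mean(y for _, y in train)
baseline_mse = mse(lambda x: mean_y, valid)


def fit_ridge(data, alpha):
    """Fit y = w * x + b by least squares with an L2 penalty alpha on w."""
    mx = statistics.mean(x for x, _ in data)
    my = statistics.mean(y for _, y in data)
    sxy = sum((x - mx) * (y - my) for x, y in data)
    sxx = sum((x - mx) ** 2 for x, _ in data)
    w = sxy / (sxx + alpha)
    b = my - w * mx
    return lambda x: w * x + b


# 8) Hyperparameter tuning: pick the alpha with the lowest validation error.
candidates = [0.0, 0.1, 1.0, 10.0]
best_alpha = min(candidates, key=lambda a: mse(fit_ridge(train, a), valid))
model = fit_ridge(train, best_alpha)

# 9) Model validation: report the error on the held-out test set.
test_mse = mse(model, test)
print("baseline MSE (validation):", round(baseline_mse, 2))
print("model MSE (test):", round(test_mse, 2), "with alpha =", best_alpha)

# 10) Making predictions: a new input must be scaled like the training data.
prediction = model(50 / max_x)
print("prediction for x = 50:", round(prediction, 2))
```

The point of the baseline in step 7 is to give the tuned model something to beat: if the model’s error is not clearly lower than the baseline’s, the later steps are not adding value.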

Now let’s walk through these commandments one by one. I hope that by the end you’ll have acquired enough knowledge to build a machine learning model of your own.


1) Acquiring data

Plenty of free, open datasets are available on Kaggle. I strongly recommend browsing the link below to see the wide variety of machine learning datasets:

Find Open Datasets and Machine Learning Projects | Kaggle (www.kaggle.com)
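
Many datasets on Kaggle are distributed as CSV files, so a first step after downloading one is usually to load it and take a quick look. The sketch below assumes a tiny, made-up CSV (the column names `id`, `age`, and `income` are hypothetical) and reads it from an in-memory string so that it runs on its own. With a real download you would open the file on disk instead.

```python
import csv
import io

# Stand-in for a CSV file downloaded from Kaggle. The columns and values are
# made up for illustration. With a real file you would instead use something
# like open("your_dataset.csv", newline="") (a hypothetical file name).
sample_csv = io.StringIO(
    "id,age,income\n"
    "1,34,52000\n"
    "2,,61000\n"
    "3,45,\n"
)

# Each row becomes a dict that maps column name to the raw string value.
rows = list(csv.DictReader(sample_csv))

# A first look: row count, column names, and empty cells per column.
columns = list(rows[0].keys())
missing = {col: sum(1 for row in rows if row[col] == "") for col in columns}

print(len(rows), "rows; columns:", columns)
print("missing values per column:", missing)
```

Counting empty cells this early is a cheap way to see how much cleansing the dataset will need later.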

On the other hand, if you’d like to create your own dataset for building a machine learning model, you can scrape data from multiple websites and data sources. That is out of scope for this post. However, you can check out the link below to explore web scraping further.


Comprehensive Guide to Machine Learning (Part 1 of 3)