In this post, we'll learn how to build an algorithm to find planets outside the solar system.
A few weeks ago, I wrote an article about using data science in meaningful ways that could help make our world a better place. Now, let’s talk a little bit about other worlds. We can train machines to identify exoplanet candidates using real datasets provided by NASA and Caltech. How cool is that? So I decided to go on an adventure through the mysteries of the universe. My idea is to create a machine learning model that can predict whether an observation is a real exoplanet candidate or not. The data was collected by the Kepler mission, which revealed thousands of planets outside our Solar System. Unfortunately, the Kepler mission ended in 2018. However, it left us thousands of observations, so we can train our machines to find planets as well.
And how did the Kepler telescope find planets so far from us if no one can take a clear picture of Pluto from Earth? Well, Kepler found planets by looking for small dips in the brightness of a star when a planet transits in front of it. The size of the planet can then be estimated from the depth of the transit and the star’s size: the larger the planet relative to its star, the deeper the dip.
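To make the transit idea concrete, here is a minimal sketch of the usual first-order relation, in which the fractional dip in brightness is roughly the square of the planet-to-star radius ratio. It ignores effects such as limb darkening and grazing transits, and the function name and constants are my own choices for illustration, not something taken from the Kepler pipeline.

```python
import math

# Approximate reference radii in kilometres (assumed values for illustration).
R_SUN_KM = 695_700.0
R_EARTH_KM = 6_371.0


def planet_radius_km(transit_depth, star_radius_km):
    """Estimate a planet's radius from the fractional transit depth.

    Uses the first-order approximation depth ~ (R_planet / R_star) ** 2,
    so R_planet ~ R_star * sqrt(depth).
    """
    if not 0.0 < transit_depth < 1.0:
        raise ValueError("transit_depth must be a fraction between 0 and 1")
    return star_radius_km * math.sqrt(transit_depth)


# Hypothetical example: a 1% dip in the light of a Sun-sized star.
radius_km = planet_radius_km(0.01, R_SUN_KM)
radius_in_earths = radius_km / R_EARTH_KM
print(f"{radius_km:.0f} km, about {radius_in_earths:.1f} Earth radii")
```

Under these assumptions, a 1% dip around a Sun-sized star corresponds to a planet roughly the size of Jupiter, while an Earth-sized planet would produce a dip of less than 0.01%, which gives a sense of how sensitive the telescope had to be.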
For this article, I downloaded the most recent dataset from the Caltech website. However, if you feel adventurous, you can use NASA’s API and fetch the data straight from the source. I know I want to explore that soon, but for now, let’s keep things a little easier and use the NASA and Caltech dataset. You can find a similar dataset on Kaggle, but it was uploaded three years ago and is not up to date. The best prediction on Kaggle reached 95% accuracy. To keep things straightforward, I will skip some of the exploratory data analysis, but I shared the notebook’s complete version on my GitHub. You should be familiar with Python and its main packages before running the following code. However, if you are not, you should still be able to reproduce the same results by running the notebook in full.