Most of us have wondered whether there is life beyond our planet. To our ancestors it seemed that we were alone, but once we began studying the stars, we realized how vast the universe is. Then we started to ask: if extraterrestrial civilizations exist, where are they? This question is known as the Fermi paradox. It describes the contradiction between the lack of evidence for extraterrestrial civilizations and the various high estimates of their probability.

But, as Carl Sagan stated: “Absence of evidence is not evidence of absence.” So we keep trying to answer the question of where the aliens are, or whether they exist at all.

Answering this question is one of the major goals of exoplanet research, and finding potentially habitable planets in other stellar systems is one of its main objectives. We have found 60 potentially habitable exoplanets so far, and we continue to look for more.

So, I decided to build a machine learning project to predict which planets are habitable. For this project, I extracted two datasets: the NASA Exoplanet Archive and PHL’s Habitable Exoplanets Catalog.

Data Knowledge

So, there are two datasets to handle: the NASA Exoplanet Archive and the PHL data, which contains the habitability status of each planet. The NASA dataset is my central dataset because it has more features of stars and planets, such as planet radius, stellar temperature, orbital period, and so on. I need the PHL data only for the habitability feature. The image below shows the distribution of the target feature, habitability. As you can see, the dataset is imbalanced; I will deal with that in the following sections.
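To make the labeling step concrete, here is a minimal sketch in plain Python of how the habitability target could be attached to the NASA rows and how the class balance could be checked. It is an illustration under assumptions, not the project's actual code: the rows, the planet names, the join on planet name, and the helper functions `label_habitability` and `class_balance` are all hypothetical, and the column names (`pl_name`, `pl_rade`, `st_teff`, `pl_orbper`) are assumed stand-ins for the planet name, planet radius, stellar temperature, and orbital period.

```python
from collections import Counter

# Hypothetical rows standing in for the NASA dataset (features only).
# Column names and values are assumptions made for this illustration.
nasa_rows = [
    {"pl_name": "Planet A", "pl_rade": 1.1, "st_teff": 5200.0, "pl_orbper": 290.0},
    {"pl_name": "Planet B", "pl_rade": 11.0, "st_teff": 6100.0, "pl_orbper": 3.5},
    {"pl_name": "Planet C", "pl_rade": 4.2, "st_teff": 4800.0, "pl_orbper": 18.0},
    {"pl_name": "Planet D", "pl_rade": 13.5, "st_teff": 5900.0, "pl_orbper": 2.1},
    {"pl_name": "Planet E", "pl_rade": 2.6, "st_teff": 5500.0, "pl_orbper": 41.0},
]

# Hypothetical set of planet names that the PHL data marks as habitable.
phl_habitable = {"Planet A"}


def label_habitability(rows, habitable_names):
    """Return copies of the rows with a binary 'habitable' target added."""
    return [
        {**row, "habitable": int(row["pl_name"] in habitable_names)}
        for row in rows
    ]


def class_balance(rows, target="habitable"):
    """Return the fraction of rows that falls into each target class."""
    counts = Counter(row[target] for row in rows)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


labeled = label_habitability(nasa_rows, phl_habitable)
balance = class_balance(labeled)
print(balance)
```

Even in this toy example the habitable class is a small minority, which is the same kind of imbalance the real target shows. Planets missing from the PHL list are treated here as not habitable, which is a simplifying assumption.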


Detecting Habitability of Exoplanets with Machine Learning