1603965000
However, this model did not perform very well because we skipped proper data exploration and preparation, which are needed to understand the data and structure the model better. In Part II of the tutorial, we will explore the dataset using Seaborn and Matplotlib. In addition, new concepts will be introduced and applied to build a better-performing model. Finally, we will improve our ranking with the second submission.
For your programming environment, you may choose one of two options: Jupyter Notebook or Google Colab Notebook:
As mentioned in Part I, you need Python installed on your system to run any Python code, along with libraries such as NumPy, Pandas, Matplotlib, and Seaborn. You also need an IDE or text editor to write your code. You may use any IDE you like, of course, but I strongly recommend installing Jupyter Notebook with the Anaconda Distribution. Jupyter Notebook runs on IPython, an interactive shell that makes testing your code very convenient. You should definitely check it out if you are not already using it.
Google Colab is built on top of Jupyter Notebook and gives you cloud computing capabilities. Instead of completing all the steps above, you can create a Google Colab notebook, which comes with these libraries pre-installed, so it is much more streamlined. I recommend Google Colab over Jupyter, but in the end, it is up to you.
To create a good model, we first need to explore our data. Seaborn, a statistical data visualization library, comes in pretty handy here. First, let's recall what our dataset looks like:
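The first look usually starts with Pandas before any Seaborn plots. As a minimal sketch (the rows below are made-up stand-ins with the same columns as the Kaggle Titanic train.csv, since the real file is not included here):

```python
import pandas as pd

# Tiny hand-made sample mirroring the Kaggle Titanic train.csv columns
# (illustrative rows only, not the real data).
df = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4],
    "Survived":    [0, 1, 1, 0],
    "Pclass":      [3, 1, 3, 2],
    "Sex":         ["male", "female", "female", "male"],
    "Age":         [22.0, 38.0, 26.0, 35.0],
    "Fare":        [7.25, 71.28, 7.92, 13.00],
})

# First look: shape, column types, and the top rows.
print(df.shape)
print(df.dtypes)
print(df.head())

# Survival rate by sex -- the kind of summary the Seaborn plots visualize.
print(df.groupby("Sex")["Survived"].mean())
```

On the real dataset, the same `groupby` summary is what a `sns.barplot(x="Sex", y="Survived", data=df)` would draw.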
#data-science #data-visualization #machine-learning #kaggle #artificial-intelligence
1598235840
The purpose of this article is to warn new Kagglers before they waste their time trying to achieve an impossible score. Some Kagglers got maximum accuracy with one click. Before we discuss how they did it and why, let's briefly introduce Kaggle's scoring model to understand why somebody would even try to cheat.
Kaggle is a portal where data scientists, machine learning experts, and analysts can challenge their skills, share knowledge, and take part in various competitions. It is open to every level of experience, from complete newbie to grandmaster. You can use open datasets to broaden your knowledge, gain kudos and swag, and even win money.
Some of the available competitions. (Image by author)
Winning competitions, taking part in discussions, and sharing your ideas result in medals. Medals are displayed on your profile along with all your achievements.
#data-science #beginner #kaggle-competition #competition #kaggle
1601966603
Riiid Labs has announced the launch of the first-ever global Artificial Intelligence Education (AIEd) Challenge, created to accelerate innovation in education by building a better and more equitable learning model for students around the world.
#edtech #kaggle #competition #artificial-intelligence #dataset #machine-learning
1603270800
Since you are reading this article, I am sure that we share similar interests and are or will be in similar industries, so let's connect on LinkedIn! Please do not hesitate to send a contact request! Orhan G. Yalçın on LinkedIn
Photo by Markus Spiske on Unsplash
If you are interested in machine learning, you have probably heard of Kaggle. Kaggle is a platform where you can learn a lot about machine learning with Python and R, work on data science projects, and (this is the most fun part) join machine learning competitions. Competitions change and are updated over time. Currently, “Titanic: Machine Learning from Disaster” is “the beginner’s competition” on the platform. In this post, we will create a ready-to-upload submission file with less than 20 lines of Python code. To do this, we will use the Pandas and Scikit-Learn libraries.
RMS Titanic was the largest ship afloat when it entered service, and it sank after colliding with an iceberg during its maiden voyage to the United States on 15 April 1912. There were 2,224 passengers and crew aboard, and unfortunately, 1,502 of them died. It was one of the deadliest peacetime commercial maritime disasters of the 20th century.
Figure 1. A Greyscale Photo of RMS Titanic on Wikipedia
One of the main reasons for such a high number of casualties was the lack of sufficient lifeboats for the passengers and crew. Although luck played a part in surviving the accident, some groups, such as women, children, and upper-class passengers, were more likely to survive than the rest. We will calculate this likelihood and the effect of particular features on survival. And we will accomplish this in less than 20 lines of code and have a file ready for submission. … Let’s Get Started!
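The pipeline described above can be sketched in a few lines of Pandas and Scikit-Learn. This is a hedged sketch, not the post's exact code: the train/test DataFrames below are synthetic stand-ins for the Kaggle CSVs, and the feature set and classifier are assumptions for illustration:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic stand-ins for the Kaggle train.csv / test.csv (illustrative only).
train = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0, 0, 1],
    "Pclass":   [3, 1, 2, 3, 1, 3, 2, 1],
    "Sex":      ["male", "female", "female", "male",
                 "female", "male", "male", "female"],
})
test = pd.DataFrame({
    "PassengerId": [9, 10],
    "Pclass":      [3, 1],
    "Sex":         ["male", "female"],
})

# One-hot encode the categorical Sex column and fit a simple classifier.
X = pd.get_dummies(train[["Pclass", "Sex"]])
y = train["Survived"]
model = LogisticRegression().fit(X, y)

# Predict on the test set, aligning its columns with the training features,
# and write a Kaggle-style submission file.
X_test = pd.get_dummies(test[["Pclass", "Sex"]]).reindex(
    columns=X.columns, fill_value=0)
submission = pd.DataFrame({
    "PassengerId": test["PassengerId"],
    "Survived": model.predict(X_test),
})
submission.to_csv("submission.csv", index=False)
print(submission)
```

With the real train.csv, the same shape of code (read, encode, fit, predict, write) produces the file you upload to Kaggle.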
#programming #technology #data-science #artificial-intelligence #machine-learning