Kaggle’s Titanic Competition in 10 Minutes

However, this model did not perform very well, since we did not do thorough data exploration and preparation to understand the data and structure the model better. In Part-II of the tutorial, we will explore the dataset using Seaborn and Matplotlib. In addition, new concepts will be introduced and applied to build a better-performing model. Finally, we will improve our ranking with the second submission.

Using Jupyter or Google Colab Notebook

For your programming environment, you may choose one of these two options: Jupyter Notebook and Google Colab Notebook:

Jupyter Notebook

As mentioned in Part-I, you need to install Python on your system to run any Python code. You also need to install libraries such as NumPy, Pandas, Matplotlib, and Seaborn, as well as an IDE (text editor) to write your code. You may use the IDE of your choice, of course, but I strongly recommend installing Jupyter Notebook with the Anaconda Distribution. Jupyter Notebook is built on IPython, an interactive shell that makes testing your code very convenient. So, you should definitely check it out if you are not already using it.
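If you want a quick sanity check that your environment is ready (this snippet is only a sketch and assumes you installed the libraries with Anaconda or pip), you can confirm they import cleanly:

import numpy as np
import pandas as pd
import matplotlib
import seaborn as sns

# Print the installed versions to confirm the libraries are available
print(np.__version__, pd.__version__, matplotlib.__version__, sns.__version__)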

Google Colab Notebook

Google Colab is built on top of Jupyter Notebook and gives you cloud computing capabilities. Instead of completing all the steps above, you can create a Google Colab notebook, which comes with these libraries pre-installed. So, it is much more streamlined. I recommend Google Colab over a local Jupyter setup, but in the end, it is up to you.

Exploring Our Data

To be able to create a good model, we first need to explore our data. Seaborn, a statistical data visualization library, comes in pretty handy here. First, let's recall what our dataset looks like:

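As a minimal sketch of that first look (assuming the competition's train.csv has been downloaded to the working directory; the exact plots used in Part-II may differ), you can preview the data and draw a quick Seaborn count plot:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load the Kaggle Titanic training set and preview the first rows
train = pd.read_csv("train.csv")
print(train.head())

# Quick look at survival counts split by passenger class
sns.countplot(x="Pclass", hue="Survived", data=train)
plt.show()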

#data-science #data-visualization #machine-learning #kaggle #artificial-intelligence

Kaggle Beginner Competitions Can Be Cheated

The purpose of this article is to warn new kagglers before they waste their time trying to reach an impossible score. Some kagglers got maximum accuracy with one click. Before we discuss how they did it and why, let's briefly introduce Kaggle's scoring model to understand why anybody would even try to cheat.

Kaggle Progression System

Kaggle is a portal where data scientists, machine learning experts, and analysts can challenge their skills, share knowledge, and take part in various competitions. It is open to every level of experience, from complete newbie to grandmaster. You can use open datasets to broaden your knowledge, gain kudos and swag, and even win money.

Some of the available competitions. (Image by author)

Winning competitions, taking part in discussions, and sharing your ideas earn you medals. Medals are displayed on your profile along with all your achievements.

#data-science #beginner #kaggle-competition #competition #kaggle #data science

Riiid Announces $100,000 Kaggle Competition Using EdNet, the World's Largest Education Dataset

Riiid Labs has announced the launch of the first-ever global Artificial Intelligence Education (AIEd) Challenge, created to accelerate innovation in education by building a better and more equitable learning model for students around the world.

Read more: https://analyticsindiamag.com/riiid-announces-100000-kaggle-competition-using-ednet-worlds-largest-education-dataset/

#edtech #kaggle #competition #artificial-intelligence #dataset #machine-learning

Kaggle’s Titanic Competition in 10 Minutes

Since you are reading this article, I am sure that we share similar interests and are or will be in similar industries. So let's connect via LinkedIn, and please do not hesitate to send a contact request! Orhan G. Yalçın on LinkedIn

Photo by Markus Spiske on Unsplash

If you are interested in machine learning, you have probably heard of Kaggle. Kaggle is a platform where you can learn a lot about machine learning with Python and R, do data science projects, and (this is the most fun part) join machine learning competitions. The competitions change and are updated over time. Currently, “Titanic: Machine Learning from Disaster” is “the beginner's competition” on the platform. In this post, we will create a ready-to-upload submission file with less than 20 lines of Python code. To be able to do this, we will use the Pandas and Scikit-Learn libraries.

The RMS Titanic and the Infamous Accident

The RMS Titanic was the largest ship afloat when it entered service, and it sank after colliding with an iceberg on 15 April 1912, during its maiden voyage to the United States. There were 2,224 passengers and crew aboard, and unfortunately, 1,502 of them died. It was one of the deadliest peacetime commercial maritime disasters of the 20th century.

Figure 1. A Greyscale Photo of the RMS Titanic (Wikipedia)

One of the main reasons for such a high number of casualties was the lack of sufficient lifeboats for the passengers and the crew. Although luck played a part in surviving the accident, some groups of people, such as women, children, and upper-class passengers, were more likely to survive than the rest. We will calculate this likelihood and the effect of particular features on the likelihood of survival. And we will accomplish this in less than 20 lines of code and have a file ready for submission. … Let's Get Started!
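To make that plan concrete, here is a minimal sketch of the kind of code the post builds up to, assuming the competition's train.csv and test.csv are in the working directory; the features and classifier below are illustrative choices, not necessarily the ones used in the final submission:

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Load the Kaggle Titanic data
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")

# Survival likelihood overall and for particular groups of passengers
print(train["Survived"].mean())
print(train.groupby("Sex")["Survived"].mean())
print(train.groupby("Pclass")["Survived"].mean())

# A simple classifier on a handful of features (Sex is one-hot encoded)
features = ["Pclass", "Sex", "SibSp", "Parch"]
X = pd.get_dummies(train[features])
X_test = pd.get_dummies(test[features])
model = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=1)
model.fit(X, train["Survived"])

# Write a ready-to-upload submission file
output = pd.DataFrame({"PassengerId": test["PassengerId"], "Survived": model.predict(X_test)})
output.to_csv("submission.csv", index=False)

The resulting submission.csv is the file you upload on the competition's submission page.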

#programming #technology #data-science #artificial-intelligence #machine-learning