This series of tutorials on data science engineering compares how different concepts in the discipline can be implemented in today's two dominant ecosystems: R and Python.

We will do this from a neutral point of view. Our opinion is that each environment has strengths and weaknesses, and any data scientist should know how to use both in order to be as prepared as possible for the job market or for starting a personal project.

To get a feeling of what is going on regarding this hot topic, we refer the reader to DataCamp's Data Science War infographic. Their infographic explores what the strengths of R are over Python and vice versa, and aims to provide a basic comparison between these two programming languages from a data science and statistics perspective.

Far from repeating that comparison, our series of tutorials goes hands-on into how to actually perform different data science tasks such as working with data frames, doing aggregations, or creating different statistical models in areas such as supervised and unsupervised learning.

We will use real-world datasets, and we will build some real data products. This will help us to quickly transfer what we learn here to actual data analysis situations.

If you are interested in Big Data products, you might also like our series of tutorials on using Apache Spark and Python or on using R on Apache Spark (SparkR).


This is a growing list of tutorials explaining concepts and applications in Python and R.

Introduction to Data Frames

An introduction to the basic data structure and how to use it in Python/Pandas and R.

Exploratory Data Analysis

About this important task in any data science engineering project.

Dimensionality Reduction and Clustering

About using Principal Component Analysis and k-means Clustering to better represent and understand our data.
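As a rough illustration of the clustering half of this tutorial, here is a minimal k-means sketch in plain Python. It is not the tutorial's own code: the function names (`assign`, `update`, `kmeans`), the sample points, and the choice to pass the initial centroids explicitly are all assumptions made for the example, and the PCA step is not covered.

```python
import math


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            # An empty cluster keeps its previous centroid.
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centroids


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update steps until the labels stop changing."""
    centroids = [tuple(c) for c in centroids]
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Hypothetical sample data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels, centroids)
```

Passing the starting centroids in keeps the sketch deterministic. Library implementations usually choose them at random or with a seeding heuristic.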

Text Mining and Sentiment Classification

How to use text mining techniques to analyse the positive or non-positive sentiment of text documents using just linear methods.
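To give a feel for what "just linear methods" can mean here, the sketch below trains a perceptron on bag-of-words counts. The perceptron is swapped in purely as an illustration of a linear classifier; the tutorial itself may use a different linear model. The function names and the tiny training set are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def score(text, weights, bias):
    """Linear score: bias plus the weighted sum of word counts."""
    counts = Counter(tokenize(text))
    return bias + sum(weights[word] * n for word, n in counts.items())


def train_perceptron(docs, labels, epochs=10):
    """docs: list of strings; labels: +1 (positive) or -1 (non-positive)."""
    weights = Counter()
    bias = 0
    for _ in range(epochs):
        for text, y in zip(docs, labels):
            if y * score(text, weights, bias) <= 0:
                # Misclassified (or on the boundary): nudge weights toward y.
                for word, n in Counter(tokenize(text)).items():
                    weights[word] += y * n
                bias += y
    return weights, bias


def predict(text, weights, bias):
    """Return +1 for positive sentiment, -1 otherwise."""
    return 1 if score(text, weights, bias) > 0 else -1


# Hypothetical toy training data.
docs = ["great wine loved it", "really great value", "terrible taste", "awful and terrible"]
labels = [1, 1, -1, -1]
weights, bias = train_perceptron(docs, labels)
print(predict("loved this great wine", weights, bias))
```

Because the decision is a weighted sum of word counts, each learned weight can be read directly as how strongly a word pushes a document toward the positive or non-positive class.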


These are some of the applications we have built using the concepts explained in the tutorials.

A web-based Sentiment Classifier using R and Shiny

How to build a web application where we can upload text documents to be sentiment-analysed using the R-based framework Shiny.

Building Data Products with Python

Using a wine reviews and recommendations website as a leitmotif, this series of tutorials, with its own separate repository tagged by lessons, digs into how to use Python technologies such as Django, Pandas, or Scikit-learn, in order to build data products.

Red Wine Quality Data analysis with R

Using R and ggplot2, we perform Exploratory Data Analysis of this reference dataset about wine quality.

Information Retrieval algorithms with Python

Where we show our own implementation of a couple of Information Retrieval algorithms: vector space model, and tf-idf.
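For readers unfamiliar with the two ideas, here is a minimal sketch, not the repository's implementation. It assumes one common tf-idf variant (term frequency as count divided by document length, inverse document frequency as the natural log of N over document frequency) and compares documents in the vector space model with cosine similarity. The function names and sample documents are made up for the example.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (count / len(doc)) * math.log(n_docs / df[t]) for t, count in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical mini-corpus, already tokenized.
docs = [
    ["red", "wine", "review"],
    ["white", "wine", "review"],
    ["python", "data", "frames"],
]
vectors = tf_idf_vectors(docs)
print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

Terms that appear in every document get an idf of zero under this variant, so they contribute nothing to the similarity. That is the intended effect of the weighting.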

Kaggle - The Analytics Edge (Spring 2015)

My solution to this Kaggle competition. It was part of the edX MOOC The Analytics Edge. I highly recommend this on-line course. It is one of the most applied courses I have ever taken about using R for data analysis and machine learning.


Contributions are welcome! For bug reports or requests please submit an issue.


Feel free to contact me to discuss any issues, questions, or comments.

Download Details:

Author: jadianes
Source Code: 
License: View license

Ray Patel


Python Packages in SQL Server – Get Started with SQL Server Machine Learning Services


When Machine Learning Services is installed in SQL Server, a few Python packages are installed by default. In this article, we will look at how to get information about those installed Python packages.

Python Packages

When we choose Python as the Machine Learning Services language during installation, the following packages are installed in SQL Server:

  • revoscalepy – This Microsoft Python package is used for remote compute contexts, streaming, parallel execution of rx functions for data import and transformation, modeling, visualization, and analysis.
  • microsoftml – This is another Microsoft Python package which adds machine learning algorithms in Python.
  • Anaconda 4.2 – Anaconda is an open-source Python distribution.
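One generic way to inspect which packages a Python runtime can see is sketched below. It is an assumption-laden example rather than the article's method: it uses the standard-library `importlib.metadata` module, which needs Python 3.8 or later and may not exist in the Python runtime bundled with a given SQL Server release, and `installed_packages` is a hypothetical helper name. Inside SQL Server, a script like this would typically be run through the `sp_execute_external_script` stored procedure, which is not shown here.

```python
from importlib import metadata


def installed_packages():
    """Return sorted (name, version) pairs for distributions visible to this interpreter."""
    pairs = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken or missing metadata
            pairs.add((name, dist.version))
    return sorted(pairs, key=lambda pair: pair[0].lower())


for name, version in installed_packages():
    print(name, version)
```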


akshay L


Data Science With Python Training | Python Data Science Course | Intellipaat

In this Data Science With Python Training video, you will learn everything about data science and Python from basic to advanced level. This Python data science course video will help you learn various Python concepts and AI, with lots of projects, hands-on demos, and finally the top trending data science and Python interview questions. This is a must-watch video for everyone who wishes to learn data science and Python to make a career in it.


Cyrus Kreiger


How I'd Learn Data Science If I Were To Start All Over Again

A couple of days ago I started thinking: if I had to start learning machine learning and data science all over again, where would I start? The funny thing was that the path I imagined was completely different from the one I actually followed when I was starting out.

I’m aware that we all learn in different ways. Some prefer videos, others are ok with just books and a lot of people need to pay for a course to feel more pressure. And that’s ok, the important thing is to learn and enjoy it.

So, speaking from my own perspective and knowing how I learn best, I designed the path I would follow if I had to start learning Data Science again.

As you will see, my favorite way to learn is going from simple to complex gradually. This means starting with practical examples and then moving to more abstract concepts.


Uriah Dietrich


How To Build A Data Science Career In 2021

For this week’s data science career interview, we got in touch with Dr Suman Sanyal, Associate Professor of Computer Science and Engineering at NIIT University. In this interview, Dr Sanyal shares his insights on how universities can contribute to this highly promising sector and what aspirants can do to build a successful data science career.

With industry linkage and technology- and research-driven education, NIIT University has been recognised for addressing the growing demand for data science experts worldwide with its industry-ready courses. The university has recently introduced a B.Tech in Data Science course, which aims to deploy data sets models to solve real-world problems. The programme provides industry-academic synergy for students to establish careers in data science, artificial intelligence and machine learning.

“Students with skills that are aligned to new-age technology will be of huge value. The industry today wants young, ambitious students who have the know-how on how to get things done,” Sanyal said.


5 stages of learning Data Science

With recruiters listing a myriad of “preferred skills” in their job postings, learning Data Science can get quite overwhelming at times. Dividing the journey up into five chapters can provide a clearer picture of what lies ahead.
