Elton Bogan


Complete Guide To XGBoost With Implementation In R

Ensemble techniques have become popular among data scientists and enthusiasts in recent years. Random Forest and Gradient Boosting used to win most data science competitions and hackathons, but over the last few years XGBoost has been outperforming other algorithms on problems involving structured data. Apart from its predictive performance, XGBoost is also recognized for its speed, accuracy, and scalability. XGBoost is built on the Gradient Boosting framework.

Like other boosting algorithms, XGBoost uses decision trees for its ensemble model. Each tree is a weak learner. The algorithm sequentially builds more decision trees, each one correcting the errors of the trees before it, until a stopping condition is reached.
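The idea of trees that correct earlier errors can be sketched in base R. This is a minimal illustration of plain gradient boosting with squared-error loss and one-split trees ("stumps"), not XGBoost itself: it leaves out XGBoost's regularization and second-order optimization. The helper names fit_stump and predict_stump, the simulated data, and the settings (learning rate 0.1, 100 trees) are all assumptions made for the example.

```r
# Fit a one-split regression tree (a "stump") to the residuals r using feature x.
# Hypothetical helper for illustration; not part of any package.
fit_stump <- function(x, r) {
  xs <- sort(unique(x))
  candidates <- (head(xs, -1) + tail(xs, -1)) / 2  # midpoints between neighbouring values
  best <- NULL
  best_sse <- Inf
  for (s in candidates) {
    left <- mean(r[x <= s])   # prediction for the left leaf
    right <- mean(r[x > s])   # prediction for the right leaf
    sse <- sum((r - ifelse(x <= s, left, right))^2)
    if (sse < best_sse) {
      best_sse <- sse
      best <- list(split = s, left = left, right = right)
    }
  }
  best
}

# Predict with a fitted stump
predict_stump <- function(stump, x) {
  ifelse(x <= stump$split, stump$left, stump$right)
}

# Simulated data (an assumption for the example)
set.seed(42)
x <- seq(0, 10, length.out = 200)
y <- sin(x) + rnorm(200, sd = 0.2)

learning_rate <- 0.1
n_trees <- 100                            # stopping condition: a fixed number of trees
prediction <- rep(mean(y), length(y))     # start from a constant model
mse_history <- numeric(n_trees)

for (m in seq_len(n_trees)) {
  residuals <- y - prediction             # errors left by the trees built so far
  stump <- fit_stump(x, residuals)        # the next weak learner fits those errors
  prediction <- prediction + learning_rate * predict_stump(stump, x)
  mse_history[m] <- mean((y - prediction)^2)
}

baseline_mse <- mean((y - mean(y))^2)
print(c(baseline = baseline_mse, boosted = mse_history[n_trees]))
```

Each pass fits a new stump to whatever the current ensemble still gets wrong, so the training error never increases from one tree to the next.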

In this article, we will discuss the implementation of the XGBoost algorithm in R.


Marcus Flatley


Getting Started with R Markdown — Guide and Cheatsheet

In this blog post, we’ll look at how to use R Markdown. By the end, you’ll have the skills you need to produce a document or presentation using R Markdown, from scratch!

We’ll show you how to convert the default R Markdown document into a useful reference guide of your own. We encourage you to follow along by building out your own R Markdown guide, but if you prefer to just read along, that works, too!

R Markdown is an open-source tool for producing reproducible reports in R. It enables you to keep all of your code, results, plots, and writing in one place. R Markdown is particularly useful when you are producing a document for an audience that is interested in the results from your analysis, but not your code.

R Markdown is powerful because it can be used for data analysis and data science, collaborating with others, and communicating results to decision makers. With R Markdown, you have the option to export your work to numerous formats including PDF, Microsoft Word, a slideshow, or an HTML document for use in a website.

Turn your data analysis into pretty documents with R Markdown.

We’ll use the RStudio integrated development environment (IDE) to produce our R Markdown reference guide. If you’d like to learn more about RStudio, check out our list of 23 awesome RStudio tips and tricks!

Here at Dataquest, we love using R Markdown for coding in R and authoring content. In fact, we wrote this blog post in R Markdown! Also, learners on the Dataquest platform use R Markdown for completing their R projects.

We included fully-reproducible code examples in this blog post. When you’ve mastered the content in this post, check out our other blog post on R Markdown tips, tricks, and shortcuts.

Okay, let’s get started with building our very own R Markdown reference document!

1. Install R Markdown

R Markdown is a free, open source tool that is installed like any other R package. Use the following command to install R Markdown:
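A minimal form of that command, assuming the package is installed from CRAN under its usual name, rmarkdown:

```r
# Install the rmarkdown package from CRAN (needs an internet connection)
install.packages("rmarkdown")
```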


Now that R Markdown is installed, open a new R Markdown file in RStudio by navigating to File > New File > R Markdown…. R Markdown files have the file extension “.Rmd”.

2. Default Output Format

When you open a new R Markdown file in RStudio, a pop-up window appears that prompts you to select the output format to use for the document.

The default output format is HTML. With HTML, you can easily view it in a web browser.

We recommend selecting the default HTML setting for now, since it can save you time. Why? Because compiling an HTML document is generally faster than generating a PDF or other format. When you near a finished product, you can change the output to the format of your choosing and then make the final touches.

One final thing to note is that the title you give your document in the pop-up above is not the file name! Navigate to File > Save As… to name and save the document.
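The output format can also be chosen from the R console instead of the pop-up. The sketch below is one way to do that, under some assumptions: the file name guide.Rmd, its contents, and the helper render_as are made up for the example, and rendering needs the rmarkdown package and pandoc to be available (RStudio bundles pandoc).

```r
# Write a small R Markdown file to a temporary directory.
# The file name and contents are placeholders for the example.
rmd_path <- file.path(tempdir(), "guide.Rmd")
chunk_fence <- strrep("`", 3)  # three backticks, the delimiter of an R code chunk

writeLines(c(
  "---",
  'title: "My R Markdown Guide"',
  "output: html_document",
  "---",
  "",
  "Some text, followed by an R code chunk:",
  "",
  paste0(chunk_fence, "{r}"),
  "summary(cars)",
  chunk_fence
), rmd_path)

# Hypothetical helper: render the file in a given output format.
# Stops with a message if rmarkdown or pandoc is missing.
render_as <- function(format = "html_document") {
  if (!requireNamespace("rmarkdown", quietly = TRUE) ||
      !rmarkdown::pandoc_available()) {
    stop("rmarkdown and pandoc are required to render the document")
  }
  rmarkdown::render(rmd_path, output_format = format, quiet = TRUE)
}

# Usage (not run here):
# render_as("html_document")   # fast, viewable in a web browser
# render_as("word_document")   # switch formats near the end of the project
```

Staying on html_document while drafting and switching only at the end matches the workflow recommended above.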

#data science tutorials #beginner #r #r markdown #r tutorial #r tutorials #rstats #rstudio #tutorial #tutorials

August Larson


R vs Python: What Should Beginners Learn?

Let go of any doubts or confusion, make the right choice and then focus and thrive as a data scientist.

I currently lead a research group with data scientists who use both R and Python. I have been in this field for over 14 years. I have witnessed the growth of both languages over the years and there is now a thriving community behind both.

I did not have a straightforward journey and learned many things the hard way. However, you can avoid making the mistakes I made and lead a more focussed, more rewarding journey and reach your goals quicker than others.

Before I dive in, let’s get something out of the way. R and Python are just tools to do the same thing. Data Science. Neither of the tools is inherently better than the other. Both the tools have been evolving over years (and will likely continue to do so).

Therefore, the short answer on whether you should learn Python or R is: it depends.

The longer answer, if you can spare a few minutes, will help you focus on what really matters and avoid the most common mistakes most enthusiastic beginners aspiring to become expert data scientists make.

#r-programming #python #perspective #r vs python: what should beginners learn? #r vs python #r

Angela Dickens


The Complete Guide to Logical Operators in R

Suppose we want to change or compare the results of the comparisons made using relational operators. How would we go about doing that?

R does this using the AND, the OR, and the NOT operators.

Logical Operators

  • AND operator &
  • OR operator |
  • NOT operator !

AND Operator “&”

The AND operator takes two logical values and returns TRUE only if both values are TRUE themselves. This means that TRUE & TRUE evaluates to TRUE, but that FALSE & TRUE, TRUE & FALSE, and FALSE & FALSE all evaluate to FALSE.

Only TRUE & TRUE gives us TRUE with the AND operator.
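The four combinations can be checked at once, because & works element by element on logical vectors. A small sketch, with variable names chosen for the example:

```r
# All four combinations of two logical values
a <- c(TRUE, TRUE, FALSE, FALSE)
b <- c(TRUE, FALSE, TRUE, FALSE)

# Element-wise AND: only the first pair (TRUE, TRUE) gives TRUE
and_result <- a & b

print(data.frame(a, b, and_result))
```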

Instead of using logical values, we can use the results of comparisons. Suppose we have a variable x, equal to 12. To check if this variable is greater than 5 but less than 15, we can use x greater than 5 and x less than 15.

x <- 12
x > 5 & x < 15

The first part, x > 5 will evaluate to TRUE since 12 is greater than 5. The second part, x < 15 will also evaluate to TRUE since 12 is also less than 15. So, the result of this expression is TRUE since TRUE & TRUE is TRUE. This makes sense, because 12 lies between 5 and 15.

However, if x were 17, the expression x > 5 & x < 15 would simplify to TRUE & FALSE, which results in the expression being FALSE.
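Both cases can be checked directly in the console. The sketch below wraps the comparison in a small helper, is_between (a hypothetical name, not a base R function), so the same check runs for both values of x.

```r
# Hypothetical helper: is x strictly between lower and upper?
is_between <- function(x, lower, upper) {
  x > lower & x < upper
}

result_12 <- is_between(12, 5, 15)  # TRUE & TRUE
result_17 <- is_between(17, 5, 15)  # TRUE & FALSE

print(c(x_is_12 = result_12, x_is_17 = result_17))
```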

For you to try

Consider the following vector and variable:

linkedin <- c(16, 9, 13, 5, 2, 17, 14)
last <- tail(linkedin, 1)

The linkedin vector represents the number of LinkedIn views your profile has gotten in the last seven days. The last variable represents the last value of the linkedin vector.

Determine whether the last variable is between 15 and 20, excluding 15 but including 20.


# We are looking for the R equivalent of 15 < last <= 20
last > 15 & last <= 20

The last value of linkedin is 14, which is not between 15 and 20, so the expression evaluates to FALSE.

For you to try (2)

Consider the following vectors:

linkedin <- c(16, 9, 13, 5, 2, 17, 14)
facebook <- c(17, 7, 5, 16, 8, 13, 14)

The linkedin vector represents the views on your LinkedIn profile from the past 7 days, and the facebook vector represents the views on your Facebook profile from the past 7 days.

Determine when LinkedIn views exceeded 10 and Facebook views failed to reach 10 for a particular day. Use the linkedin and facebook vectors.


# linkedin exceeds 10 but facebook below 10
linkedin > 10 & facebook < 10

Only on the third day were the LinkedIn views greater than 10 but the Facebook views less than 10.
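To get the day number itself rather than a vector of TRUE and FALSE values, the result can be passed to which(). This is an optional extra step, not part of the original exercise:

```r
linkedin <- c(16, 9, 13, 5, 2, 17, 14)
facebook <- c(17, 7, 5, 16, 8, 13, 14)

# which() returns the positions where the condition is TRUE
matching_days <- which(linkedin > 10 & facebook < 10)
print(matching_days)
```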

For you to try (3)

Consider the following matrix:

views <- matrix(c(linkedin, facebook), nrow = 2, byrow = TRUE)

The linkedin and facebook variables are the same vectors used in the previous exercise.

The matrix views has the first and second row corresponding to the linkedin and facebook vectors, respectively.

Determine which elements of the views matrix are between 11 and 14, excluding 11 and including 14.


# When is views between 11 (exclusive) and 14 (inclusive)?
views > 11 & views <= 14
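Because & is applied element by element, the result is a logical matrix with the same shape as views. The sketch below rebuilds the matrix and, as an optional extra, uses which() with arr.ind = TRUE to list the row and column of each match:

```r
linkedin <- c(16, 9, 13, 5, 2, 17, 14)
facebook <- c(17, 7, 5, 16, 8, 13, 14)
views <- matrix(c(linkedin, facebook), nrow = 2, byrow = TRUE)

# Logical matrix: TRUE where 11 < value <= 14
in_range <- views > 11 & views <= 14
print(in_range)

# Row (1 = linkedin, 2 = facebook) and column (day) of each TRUE
hits <- which(in_range, arr.ind = TRUE)
print(hits)
```

Here the matches are days 3 and 7 for LinkedIn and days 6 and 7 for Facebook.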

#data-analytics #data-analysis #r #r-programming #data-science

Fredy Larson


A Complete Beginners Guide to Regular Expressions in R

Learn to Match Any Pattern. It is Easier Than You Think.

A regular expression is a sequence of characters that describes a pattern to match in a piece of text or a text file. Regular expressions are used for text mining in many programming languages. The characters of a regular expression are much the same in all languages, but the functions for extracting, locating, detecting, and replacing can differ from language to language.
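As a rough preview of those four tasks, here is a sketch using base R functions only. The sample strings and the digit pattern are made up for the example, and the article itself may use different functions or packages.

```r
# Sample text and a pattern that matches one or more digits
text <- c("order 66 shipped", "no digits here", "room 101")
pattern <- "[0-9]+"

# Detect: does each string contain a match?
detected <- grepl(pattern, text)

# Locate: starting position of the first match (-1 means no match)
located <- regexpr(pattern, text)

# Extract: the matched substrings (strings without a match are dropped)
extracted <- regmatches(text, located)

# Replace: substitute every match with "#"
replaced <- gsub(pattern, "#", text)

print(detected)
print(as.integer(located))
print(extracted)
print(replaced)
```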

In this article, I will use R, but you can learn how to use regular expressions from this article even if you wish to use some other language. Regular expressions may look complicated when you do not know them, but as I mentioned at the top, they are easier than you think. I will try to explain as much as I can. You are welcome to ask me questions in the comment section if any part is unclear.

Here we will learn by doing. I will start with very basic ideas and slowly move towards more complicated patterns.

I used RStudio for all the exercises in this article.

#artificial-intelligence #data-science #programming #r #r-programming

Noemi Sanford


Complete Linear Regression in R | Machine Learning in R | R for Beginners

This tutorial introduces machine learning and linear regression in R 4.0. We start with an introduction to machine learning and then move on to linear regression, its types, and its use cases. There are two types of linear regression: simple linear regression and multiple linear regression. Use cases include house price prediction, stock price prediction, and Twitter popularity prediction. I will then show you how to analyze the Boston housing dataset, examining its variables to understand the dependencies between them for the linear regression model. I will show you linear and non-linear regression models, and finally how you can improve the accuracy of a linear regression model.
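The difference between simple and multiple linear regression can be sketched with base R's lm(). To stay free of extra packages, this example uses the built-in mtcars data rather than the Boston housing dataset, and the choice of variables (mpg, wt, hp) is an assumption made for illustration.

```r
# Simple linear regression: one predictor (car weight)
simple_fit <- lm(mpg ~ wt, data = mtcars)

# Multiple linear regression: two predictors (weight and horsepower)
multiple_fit <- lm(mpg ~ wt + hp, data = mtcars)

# Compare how much of the variance in mpg each model explains
simple_r2 <- summary(simple_fit)$r.squared
multiple_r2 <- summary(multiple_fit)$r.squared

print(coef(simple_fit))
print(coef(multiple_fit))
print(c(simple = simple_r2, multiple = multiple_r2))
```

Adding a predictor never lowers R-squared on the training data, so a comparison like this is only a starting point. Adjusted R-squared or held-out data gives a fairer view of whether the extra variable helps.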

#machine-learning #r #r-programming #developer