5 Data Structures to Master in R if you want to be a Data Scientist

Learn how to master the basic data types and the more advanced data structures, such as factors, lists, and data frames.

To become an R data scientist, you will need to master the basics of this widely used open-source language, including factors, lists, and data frames. After mastering these data structures, you’ll be ready to undertake your very first data analysis!

The five data structures are:

  1. Vectors
  2. Matrices
  3. Factors
  4. Data Frames
  5. Lists

Read until the end for a cheat-sheet of the data types.

The Basic Data Types

Before we start with the data structures, it is important to take a look at the basic data types that make up some of the elements in these data structures.

The key types are:

  • numerics: decimal values like 4.5, or integer values like 4
  • logicals: boolean values (TRUE or FALSE)
  • characters: text (or string) values like "medium" (note that these are case sensitive)

# Here are some variables being assigned these basic data types
my_numeric <- 9.5
my_logical <- TRUE
my_character <- "Linda"
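
To check which basic type a value has, base R provides class(). The short sketch below reuses the variable names from the snippet above. The extra values 4 and 4L are added here only for illustration and are not part of the original example.

```r
# Same assignments as above
my_numeric <- 9.5
my_logical <- TRUE
my_character <- "Linda"

# class() reports the basic data type of each variable
class(my_numeric)    # "numeric"
class(my_logical)    # "logical"
class(my_character)  # "character"

# A whole number typed without a suffix is still stored as a numeric (double);
# the L suffix asks R for a true integer
class(4)             # "numeric"
class(4L)            # "integer"
```

The last two lines show that the "integer values like 4" mentioned above are still numerics unless an integer is requested explicitly.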

If you want to get more in-depth with the basics in R, check out this article I wrote which teaches you how to calculate, assign variables, and work with the basic data types. It includes practice problems to work on too!


Vectors

Vectors are one-dimensional arrays that can store any one of the basic data types: numerics, logicals, or characters. All elements of a given vector share the same type.

Creating a Vector

To create a vector, use the combine function [c()](https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/c), with the elements separated by commas inside the parentheses.

# General form (elem1, elem2, and elem3 are placeholders for your own values)
my_vector <- c(elem1, elem2, elem3)

# Concrete examples
numeric_vector <- c(1, 2, 3)
character_vector <- c("a", "b", "c")
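
Since a vector holds a single data type, it can help to see what c() does when the elements do not match. The sketch below repeats the two vectors from above and adds a hypothetical mixed_vector, which is not in the original example, to show R converting every element to the most general type.

```r
numeric_vector <- c(1, 2, 3)
character_vector <- c("a", "b", "c")

class(numeric_vector)     # "numeric"
length(character_vector)  # 3

# Mixing types: every element is converted to character
mixed_vector <- c(1, "a", TRUE)
class(mixed_vector)       # "character"
mixed_vector              # "1" "a" "TRUE"
```

If the elements need to keep different types, a list is the more suitable structure.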

Naming a Vector

Naming a vector gives a name to each of its elements. This is done with the names() function.

# Without names, it's not clear what data is being used
some_vector <- c("Linda", "Data Scientist")
names(some_vector) <- c("Name", "Profession")

# Output
> some_vector
     Name          Profession
   "Linda"   "Data Scientist"

Selecting from a Vector

If we want to select a single element from a vector, we simply put in the index of the element we want to select between square brackets.

# my_vector is the vector we are selecting from
# i is the index of the element
my_vector[i]

# To select the first element
# Note that the first element has index 1, not 0 (as in many other programming languages)
my_vector[1]

To select multiple elements from a vector, indicate which elements should be selected using a vector within the square brackets.

# For example, to select the first and fifth elements, use c(1, 5)
my_vector[c(1, 5)]

# For example, to select a range, we can abbreviate c(2, 3, 4) to 2:4
my_vector[2:4]

We can also use the names of the elements instead of their numeric position.
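
As a minimal sketch, the selection styles described in this section can be tried on one small named vector. The vector scores and its values are hypothetical, chosen only for illustration.

```r
# Hypothetical named vector
scores <- c(Mon = 10, Tue = 20, Wed = 30, Thu = 40, Fri = 50)

scores[1]                # first element (Mon)
scores[c(1, 5)]          # first and fifth elements (Mon and Fri)
scores[2:4]              # a range (Tue, Wed, Thu)

# Selecting by name instead of by position
scores["Wed"]            # single element by name
scores[c("Mon", "Fri")]  # several elements by name
```

Selecting by name returns the same elements as selecting by position, and the code keeps working if the order of the elements changes.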


If you want to get more in-depth with vectors, check out this article I wrote on how to create, name, select, and compare vectors. By the end of it, you’ll learn how to analyze gaming results using vectors!


Matrices

A matrix is a collection of elements of the same data type (numeric, character, or logical) arranged into a fixed number of rows and columns. Because it has only rows and columns, a matrix is two-dimensional.

Creating a Matrix

The [**matrix()**](https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/matrix) function creates a matrix. There are three important arguments to this function:

  1. **data**: The vector of elements that will be arranged into the matrix rows and columns. This argument is optional; if we leave it out, the matrix is filled with NA values, which can be replaced later. We can use vectors we already created here.
  2. **byrow**: Indicates whether the matrix is filled row-wise (byrow=TRUE) or column-wise (byrow=FALSE). By default it is set to FALSE.
  3. **nrow**: Indicates the desired number of rows.
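
The effect of these arguments is easiest to see side by side. The sketch below uses a hypothetical vector named values and the base R argument names data (passed first), nrow, ncol, and byrow.

```r
# Hypothetical data: the integers 1 to 6
values <- 1:6

# Fill row by row
by_row <- matrix(values, nrow = 2, byrow = TRUE)
by_row
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6

# Fill column by column (byrow defaults to FALSE)
by_col <- matrix(values, nrow = 2)
by_col
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

dim(by_row)  # 2 3 (rows, then columns)

# Leaving out the data gives a matrix filled with NA
empty <- matrix(nrow = 2, ncol = 2)
```

Only nrow is given here; R works out the number of columns from the length of the data.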
