Unpivot a column of delimited data with R

After explaining how to unpivot columns of delimited data in Power Query and Python, today I’m extending those explanations to R.

Previously, I’ve explained how to take a column of delimited data and extract the individual values into their own rows in Power Query (Excel and Power BI) and in Python (pandas).

Today I am expanding this mini-series by explaining how these data transformations can be achieved in R.

Once again I’ll make use of this social network usage sample data to demo the transformations.

Sample data

The objective is to take the above initial data (loaded from a CSV file) and transform it to the following form:

Resulting data

As an extra, I will also show you how to visualize the frequency of the social networks in a bar chart, with Plotly.

If you wish to skip the explanations and jump directly to the code, feel free to visit my GitHub repository where I have all the code and sample data.

Split and unpivot data

The main focus of this demo is splitting and unpivoting the delimited data.

Sample data

We can see the “Used Social Networks” column can have multiple social networks in each row (maybe it was a multiple choice question in a survey), separated by semicolons (;). This isn’t a suitable format for data analysis, as we can’t count the frequency of each individual social network.
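To make the problem concrete, here is a minimal base R sketch. The data frame and its column names (`respondent_id`, `networks`) are hypothetical stand-ins for the sample file, not the actual data.

```r
# Hypothetical sample resembling the survey data; values are illustrative only.
survey <- data.frame(
  respondent_id = 1:3,
  networks = c("Facebook;Instagram", "Facebook", "Instagram;Facebook"),
  stringsAsFactors = FALSE
)

# Counting the raw column treats every combination as its own category.
raw_counts <- table(survey$networks)
print(raw_counts)

# Facebook appears in all three rows, yet no category above reports a count of 3.
facebook_rows <- sum(grepl("Facebook", survey$networks, fixed = TRUE))
print(facebook_rows)
```

Each distinct combination of networks gets its own count, so the totals per individual network are hidden until the values are split apart.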

So, the logic for extracting the individual social networks and putting them on their own rows (unpivot) is as follows:

  • Split the values of each row into their own column (e.g. Facebook;Instagram are split into two columns, one for Facebook, another for Instagram)
  • Take those columns with individual options and put them in a single column (unpivot those columns)
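The two steps above can be sketched in base R as follows. This is only an illustration under assumptions: the data frame, its column names (`respondent_id`, `gender`, `networks`) and its values are hypothetical, and the original code in the repository may well use a different approach or package.

```r
# Hypothetical sample data; names and values are illustrative only.
survey <- data.frame(
  respondent_id = c(1, 2, 3),
  gender = c("Female", "Male", "Female"),
  networks = c("Facebook;Instagram", "Twitter", "Facebook;LinkedIn;Twitter"),
  stringsAsFactors = FALSE
)

# Step 1: split each delimited value into its own column.
parts <- strsplit(survey$networks, ";", fixed = TRUE)
max_parts <- max(lengths(parts))
# Pad shorter rows with NA so every row has the same number of columns.
padded <- lapply(parts, function(p) c(p, rep(NA_character_, max_parts - length(p))))
wide <- do.call(rbind, padded)
colnames(wide) <- paste0("network_", seq_len(max_parts))

# Step 2: unpivot by stacking the split columns into a single column.
# as.vector() reads the matrix column by column, so the identifying
# columns are repeated once per split column to stay aligned.
long <- data.frame(
  respondent_id = rep(survey$respondent_id, times = max_parts),
  gender = rep(survey$gender, times = max_parts),
  network = as.vector(wide),
  stringsAsFactors = FALSE
)

# Drop the NA padding and restore the original respondent order.
long <- long[!is.na(long$network), ]
long <- long[order(long$respondent_id), ]
rownames(long) <- NULL
print(long)
```

With the data in this shape, `table(long$network)` counts each network individually. If the tidyr package is available, its `separate_rows()` function should be able to perform both steps in a single call.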

Split and unpivot data transformations

(Notice how the data in the “Respondent ID” and “Gender” columns is repeated, so that each social network stays associated with its respondent.)
