R excels at data analysis, which includes data processing, manipulation, and analysis. It offers numerous functions for these tasks, and the subset() function is one of them.

In general terms, a subset is a set of data that is derived or extracted from a larger base dataset.

For example, consider the word “R-Programming”. The word “Program” is a subset of this base word. At the same time, “R-lang” is not a subset of “R-Programming”: even though “R” is present, the sequence “lang” does not appear in the parent or base word.

I hope the above example brings you closer to the concept of subsetting data. Let’s move on and explore some benefits of the subset() function in R.



Let’s start with the syntax

**subset():** The subset() function extracts and returns a specific part of the input data based on the given conditions.

subset(x, condition, select)

Where:

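  • x = The object to be subsetted: a vector, a matrix, or a data frame.
  • condition = A logical expression that the returned rows or elements must satisfy (in R itself, this argument is named subset).
  • select = An expression indicating which columns to keep (for data frames and matrices).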
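The three arguments above can be sketched as follows. This is a minimal illustration, assuming the built-in airquality data frame that ships with R; the variable names scores, big_scores and hot_days are made up for the example.

```r
# x as a plain vector: keep only the elements that satisfy the condition.
scores <- c(5, 12, 8, 20)               # hypothetical example data
big_scores <- subset(scores, scores > 10)
print(big_scores)

# x as a data frame: the condition filters rows, select picks columns.
# airquality is a data frame from the built-in datasets package.
hot_days <- subset(airquality, Temp > 90, select = c(Ozone, Temp))
print(names(hot_days))  # only the selected columns remain
print(nrow(hot_days))   # number of rows that met the condition
```

Inside the call, column names such as Temp and Ozone can be written without quotes or a `airquality$` prefix, because subset() evaluates the condition and select expressions within the data frame.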

Key benefits of the subset() function in R

The subset() function in R is beneficial for a few reasons:

  • subset() is a **built-in R function** and doesn’t require installing additional packages.
  • The filter() function from the dplyr package does the same job (subsetting data), but it needs an extra package, and subset() can be faster in terms of execution time depending on the data.
  • It is useful when **dealing with large datasets**.
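Execution time depends on the data and the machine, so it is worth measuring rather than assuming. The sketch below is one possible way to time subset() with system.time() on a synthetic data frame (the names big, res_subset and t_subset are made up for the example). The comparison with dplyr::filter() only runs if dplyr happens to be installed.

```r
set.seed(42)                      # reproducible synthetic data
n <- 1e5
big <- data.frame(id = seq_len(n), value = runif(n))

# Time the base R subset() call.
t_subset <- system.time(res_subset <- subset(big, value > 0.5))

# Time dplyr::filter() only when the dplyr package is available.
if (requireNamespace("dplyr", quietly = TRUE)) {
  t_filter <- system.time(res_filter <- dplyr::filter(big, value > 0.5))
  print(c(subset = t_subset[["elapsed"]], filter = t_filter[["elapsed"]]))
} else {
  print(c(subset = t_subset[["elapsed"]]))
}
```

A single run like this is noisy; repeating the measurement several times gives a more reliable picture.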

The subset() function in R – An Easy Example

In this section, with the help of a simple example, we are going to subset the data.

Loading the dataset:

#importing dataset

datasets::airquality

Airquality Dataset

In the above image, you can see the ‘airquality’ dataset, which is available in R by default. Now, let’s apply the subset() function to extract the rows whose Ozone value is greater than 30.

#returns the rows where Ozone is greater than 30

subset(airquality, Ozone > 30)

Ozone values > 30

In the above image, you can see that all the values in the Ozone column are greater than 30. I hope you now have a better understanding of this function.
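One detail worth knowing is how missing values are handled. The Ozone column of airquality contains NA values, and subset() drops rows where the condition evaluates to NA. The sketch below compares this with plain bracket indexing; the variable names are made up for the example.

```r
# subset() treats an NA condition as FALSE, so those rows are dropped.
high_ozone <- subset(airquality, Ozone > 30)

# Bracket indexing with a logical vector keeps a row of NAs
# for every position where the condition is NA.
bracket <- airquality[airquality$Ozone > 30, ]

# Wrapping the condition in which() drops the NA positions,
# giving the same rows as subset().
with_which <- airquality[which(airquality$Ozone > 30), ]

print(nrow(high_ozone))
print(nrow(bracket))
print(nrow(with_which))
```

The difference in row counts between the first two results equals the number of missing Ozone values in the dataset.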

Let’s move further and explore more about the subset() function in R.


Multiple conditions using subset function

In the previous section, we passed a single condition to the function. Now, let’s pass multiple conditions to subset() and see how it works.

#function with multiple conditions

subset(airquality, Ozone > 30 & Temp >= 40 & Wind >= 5)

Subset function in R with multiple conditions

In the above code, you can see that we passed three conditions to the function, joined with &. In the output, every returned row satisfies all three conditions.

In the same way, you can combine as many conditions as you need using & (and) or | (or), and the function returns only the rows that satisfy them.
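As a rough sketch of how conditions can be combined, the example below uses & (all conditions must hold), | (at least one must hold) and %in% (membership in a set of values), again on the built-in airquality data. The thresholds and variable names are arbitrary choices for illustration.

```r
# AND: a row is kept only if every condition is TRUE.
all_three <- subset(airquality, Ozone > 30 & Temp >= 40 & Wind >= 5)

# OR: a row is kept if at least one condition is TRUE.
either <- subset(airquality, Ozone > 100 | Temp > 95)

# %in% checks membership; select limits the returned columns.
summer <- subset(airquality,
                 Month %in% c(7, 8) & Ozone > 30,
                 select = c(Month, Day, Ozone))

print(nrow(all_three))
print(nrow(either))
print(head(summer))
```

With |, a row can still be returned when one side is NA as long as the other side is TRUE, so the OR result may contain missing Ozone values.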
