The by() function in R splits a data set into groups and applies a function, such as mean or median, to each group.

It takes an R object as input, usually a data frame or a matrix (a plain vector also works), and applies the given function to each group of that data.

Let’s start with the syntax.

The by() function takes the data, the grouping values, and the function to apply to each group.

by(data, INDICES, FUN, ...)

Where,

  • data = The input data, usually a data frame or a matrix.
  • INDICES = A factor, or a list of factors, that defines the groups.
  • FUN = The function to apply to each group. Any extra arguments after FUN are passed on to it.
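The arguments above can be sketched with two small calls. This is only an illustration: the names col_means and mpg_means are made up here, and the built-in mtcars dataset is an assumption brought in only because it has two grouping columns.

```r
# Data frame input: each group reaches FUN as a smaller data frame,
# so colMeans() returns one mean per numeric column.
col_means <- by(iris[, 1:4], iris$Species, colMeans)
col_means[["setosa"]]   # named vector of the four column means for setosa

# INDICES as a list of two factors: one result per cyl/am combination.
mpg_means <- by(mtcars, list(cyl = mtcars$cyl, am = mtcars$am),
                function(g) mean(g$mpg))
dim(mpg_means)          # rows follow cyl, columns follow am
```

When INDICES is a list, naming its elements (cyl, am) makes the printed group headers easier to read.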

A Simple Example of the by() function in R

In this section, we try out a simple example using the built-in ‘iris’ dataset. It suits this purpose because it pairs a categorical variable (Species) with numeric measurements such as Petal.Width.
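Before grouping, it can help to check which column is categorical. The following is a minimal sketch using only base R, with df as the working copy of iris:

```r
df <- iris

levels(df$Species)   # the three species that will act as groups
sapply(df, class)    # four numeric columns and one factor (Species)
table(df$Species)    # number of rows in each group
```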

Let’s load the dataset and group it with the following code.

# assign the built-in iris data to the variable df
df <- iris

# compute the mean Petal.Width for each species
by(df$Petal.Width, list(df$Species), mean)
Output = 
---------------------------------------------------------
: setosa
[1] 0.246
---------------------------------------------------------- 
: versicolor
[1] 1.326
---------------------------------------------------------- 
: virginica
[1] 2.026
----------------------------------------------------------

In the above output, you can see that by() groups the Petal.Width values by species and returns the mean for each group. Similarly, you can pass any other function to by(), and it will be applied to each group in the same way.

df<-iris

by(df$Petal.Width,list(df$Species),median)
Output =
---------------------------------------------------------
: setosa
[1] 0.2
---------------------------------------------------------- 
: versicolor
[1] 1.3
---------------------------------------------------------- 
: virginica
[1] 2
----------------------------------------------------------

In the above output, I have passed the median function to by(), so it returns the median Petal.Width for each species instead of the mean.
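The function passed to by() does not have to be a built-in one. As a rough sketch, the call below passes the whole data frame and an anonymous function that returns both statistics at once; the name width_stats is made up for this illustration.

```r
df <- iris

# Each group g is a data frame holding the rows of one species.
width_stats <- by(df, df$Species, function(g) {
  c(mean = mean(g$Petal.Width), median = median(g$Petal.Width))
})

width_stats[["virginica"]]   # mean and median Petal.Width for virginica
```

Passing the data frame rather than a single column lets the function use several columns if needed.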

