R: Loop over unique values in a data frame column to create another column based on conditions

My dataset consists of scores and total respondents for questions asked in a survey, over a number of fiscal years (FY13, FY14 & FY15) and in different regions.

My objective is to loop through the FY column and identify, for each region, when each question was asked, and to store this information in a new column.

This is what a reproducible sample looks like:

testdf=data.frame(FY=c("FY13","FY14","FY15","FY14","FY15","FY13","FY14","FY15","FY13","FY15","FY13","FY14","FY15","FY13","FY14","FY15"),
              Region=c(rep("AFRICA",5),rep("ASIA",5),rep("AMERICA",6)),
              QST=c(rep("Q2",3),rep("Q5",2),rep("Q2",3),rep("Q5",2),rep("Q2",3),rep("Q5",3)),
              Very.Satisfied=runif(16,min = 0, max=1),
              Total.Very.Satisfied=floor(runif(16,min=10,max=120)),
              Satisfied=runif(16,min = 0, max=1),
              Total.Satisfied=floor(runif(16,min=10,max=120)),
              Dissatisfied=runif(16,min = 0, max=1),
              Total.Dissatisfied=floor(runif(16,min=10,max=120)),
              Very.Dissatisfied=runif(16,min = 0, max=1),
              Total.Very.Dissatisfied=floor(runif(16,min=10,max=120)))

I start by creating an ID column, concatenating Region and QST:

library(tidyr)  # unite(); tidyr also re-exports the %>% pipe
testdf <- testdf %>%
  unite(ID, c('Region', 'QST'), sep = "", remove = FALSE)
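
As a quick check on the ID column, roughly the same concatenation can be done in base R with `paste0`, and a cross-tabulation of ID against FY shows which years each question was asked in. This is only a sketch on a cut-down copy of the sample data (just the FY, Region and QST columns; the names `df` and `asked` are introduced here for illustration).

```r
# Cut-down copy of the sample data: only the columns needed here
df <- data.frame(
  FY     = c("FY13", "FY14", "FY15", "FY14", "FY15", "FY13", "FY14", "FY15",
             "FY13", "FY15", "FY13", "FY14", "FY15", "FY13", "FY14", "FY15"),
  Region = c(rep("AFRICA", 5), rep("ASIA", 5), rep("AMERICA", 6)),
  QST    = c(rep("Q2", 3), rep("Q5", 2), rep("Q2", 3), rep("Q5", 2),
             rep("Q2", 3), rep("Q5", 3)),
  stringsAsFactors = FALSE
)

# Base R counterpart of unite(..., sep = "")
df$ID <- paste0(df$Region, df$QST)

# Rows are IDs, columns are fiscal years; a 0 means the question was not asked
asked <- table(df$ID, df$FY)
asked
```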

My Objective

For each unique ID, identify whether the given question was asked:

a) In one year only (FY13, FY14 or FY15)

b) Over the past two years (FY15 & FY14 only)

c) Over the past three years (FY15 & FY14 & FY13)

d) In FY13 & FY15 only
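
The four cases above could also be written as a small helper that takes the fiscal years seen for one ID and returns a label, applied per ID without an explicit loop. This is only a sketch: `classify_years` is a hypothetical name, the labels are the ones used in the loop attempt in this post, and the data is a cut-down copy of the sample with just FY, Region and QST. Combinations not covered by the list (for example FY13 & FY14 only) are labelled "Other" here, which is an assumption.

```r
# Cut-down copy of the sample data: only the columns needed for the tally
df <- data.frame(
  FY     = c("FY13", "FY14", "FY15", "FY14", "FY15", "FY13", "FY14", "FY15",
             "FY13", "FY15", "FY13", "FY14", "FY15", "FY13", "FY14", "FY15"),
  Region = c(rep("AFRICA", 5), rep("ASIA", 5), rep("AMERICA", 6)),
  QST    = c(rep("Q2", 3), rep("Q5", 2), rep("Q2", 3), rep("Q5", 2),
             rep("Q2", 3), rep("Q5", 3)),
  stringsAsFactors = FALSE
)
df$ID <- paste0(df$Region, df$QST)

# Hypothetical helper: map the set of fiscal years for one ID to a label
classify_years <- function(fy) {
  yrs <- sort(unique(fy))
  if (length(yrs) == 1) {
    "Question Asked Once Only"
  } else if (identical(yrs, c("FY14", "FY15"))) {
    "Asked Over The Past Two Years"
  } else if (identical(yrs, c("FY13", "FY14", "FY15"))) {
    "Asked Over The Past Three Years"
  } else if (identical(yrs, c("FY13", "FY15"))) {
    "Question Asked in FY13 & FY15 Only"
  } else {
    "Other"  # assumption: e.g. FY13 & FY14 only, not covered by the list
  }
}

# ave() applies the helper within each ID and repeats the label on every row
df$Tally <- ave(df$FY, df$ID, FUN = classify_years)

unique(df[, c("ID", "Tally")])
```

Using `identical()` on the sorted years keeps the four cases mutually exclusive, so the order of the branches does not matter.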

My Attempt

For this problem, I tried to write a for loop: for each unique ID, I first store the unique occurrences of each FY in which the question was asked in a vector v. Then, using an if/else statement, I assign a comment to a newly created column called Tally based on these occurrences.

for (i in unique(testdf$ID))
{
  v=unique(testdf$FY)

  if(('FY15' %in% v) & ('FY14' %in% v)) {
    testdf$Tally=='Asked Over The Past Two Years'
  }
  else if(('FY15' %in% v) & ('FY14' %in% v) & ('FY13' %in% v)) {
    testdf$Tally=='Asked Over The Past Three Years'
  }
  else if(('FY13' %in% v) & ('FY15' %in% v)) {
    testdf$Tally=='Question Asked in FY13 & FY15 Only'
  }
  else {
    testdf$Tally=='Question Asked Once Only'
  }

}

The loop seems to run without throwing an error message, but it doesn’t seem to create the new Tally column.
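
A likely explanation is that `==` compares instead of assigning: comparing the non-existent `testdf$Tally` (which is `NULL`) with a string just returns an empty logical vector, so nothing is stored and no error is raised. In addition, `v` is computed from the whole FY column rather than from the rows of the current ID, and the two-year test comes before the three-year test, so the three-year branch can never be reached. The sketch below addresses these three points on a cut-down copy of the sample data (only FY, Region and QST; the names `df` and `rows` are introduced here for illustration).

```r
# Cut-down copy of the sample data: only the columns needed for the tally
df <- data.frame(
  FY     = c("FY13", "FY14", "FY15", "FY14", "FY15", "FY13", "FY14", "FY15",
             "FY13", "FY15", "FY13", "FY14", "FY15", "FY13", "FY14", "FY15"),
  Region = c(rep("AFRICA", 5), rep("ASIA", 5), rep("AMERICA", 6)),
  QST    = c(rep("Q2", 3), rep("Q5", 2), rep("Q2", 3), rep("Q5", 2),
             rep("Q2", 3), rep("Q5", 3)),
  stringsAsFactors = FALSE
)
df$ID <- paste0(df$Region, df$QST)

# Create the column first so it can be filled in per ID
df$Tally <- NA_character_

for (i in unique(df$ID)) {
  rows <- df$ID == i            # rows belonging to the current ID
  v <- unique(df$FY[rows])      # fiscal years for this ID only

  # Test the most specific case (all three years) first
  if (all(c("FY13", "FY14", "FY15") %in% v)) {
    df$Tally[rows] <- "Asked Over The Past Three Years"
  } else if (all(c("FY14", "FY15") %in% v)) {
    df$Tally[rows] <- "Asked Over The Past Two Years"
  } else if (all(c("FY13", "FY15") %in% v)) {
    df$Tally[rows] <- "Question Asked in FY13 & FY15 Only"
  } else {
    # Note: this branch would also catch FY13 & FY14 only
    df$Tally[rows] <- "Question Asked Once Only"
  }
}

unique(df[, c("ID", "Tally")])
```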

Any help with this will be greatly appreciated.

#r #loops