A variable is a quantity that may vary from object to object. For example, suppose we measure the heights of 50 mango trees in a selected plot and arrange the results in a table. Here, the quantity that varies between objects (trees) is height. Height, therefore, is the only variable in this example. The table containing the collected values of our variable is called a ‘dataset’ or sample.

Independent vs dependent variables

Let us consider an example. Algal net primary productivity (mass of carbon per unit area per year, g C m^-2 yr^-1) is measured under various temperature and light intensity settings. In this experiment, there are three variables involved: primary productivity, temperature and light intensity. However, out of the three, only one variable (primary productivity) is measured; the values of the other two variables are controlled in the experimental set-up. Variables whose variation does not depend on other variables are called independent variables. In this example, both temperature and light intensity are independent variables, as the variation in their values does not depend on other variables. Neither temperature nor light intensity is dependent on primary productivity. However, primary productivity is dependent on both temperature and light intensity. Primary productivity in this example is a dependent variable: a variable whose values depend upon other variables.

To test whether a variable is independent or dependent, a useful tactic is to substitute the suspected variables into this sentence and see whether the statement makes sense: (Independent Variable) causes a change in (Dependent Variable) and it is not possible that (Dependent Variable) could cause a change in (Independent Variable). For example, let us consider two variables, ‘time spent studying’ and ‘test score’. (Time Spent Studying) causes a change in (Test Score) and it isn’t possible that (Test Score) could cause a change in (Time Spent Studying). We see that “Time Spent Studying” must be the independent variable and “Test Score” must be the dependent variable, because the sentence does not make sense the other way around. Note that if ‘time’ (or a related concept such as ‘age’) is taken as a variable in the experiment, it is always an independent variable.

As a more formal procedure to test whether two variables are related to each other, a test of correlation (also called covariation) can be adopted. For quantitative data, Pearson’s correlation test can be used, while for categorical data Pearson’s χ² test of independence can be used. However, correlation does not reveal which variable is dependent and which is independent. Beware of the problem of statistical confounding discussed in module 2. Correlation will be discussed at length in a later module.
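The two statistics mentioned above can be sketched in a few lines of plain Python. This is only an illustration: the study hours, test scores and the 2×2 table are made-up numbers, and the helper names `pearson_r` and `chi_square_statistic` are hypothetical. It computes the statistics only, not the p-values that a full test would report.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Made-up quantitative data: hours spent studying and test scores
hours_studied = [1, 2, 3, 4, 5, 6]
test_scores = [52, 55, 61, 68, 70, 79]
r = pearson_r(hours_studied, test_scores)

# Made-up categorical data: rows are two groups, columns are pass / fail counts
table = [[30, 10], [20, 20]]
chi2 = chi_square_statistic(table)

print(f"Pearson r = {r:.3f}")
print(f"Chi-square statistic = {chi2:.3f}")
```

Swapping the two arguments of `pearson_r` gives the same value, which is one way to see that correlation alone cannot say which variable is the dependent one.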

In scientific experiments, often only the dependent variables are measured. Dependent variables are therefore also known as outcome variables or response variables, as they represent the outcome of the experiment. The values of these outcome variables are in turn dependent on (and determined by) the independent variables. As the experimenter often controls the values of the independent variables as part of the experimental design, these variables are also known as treatment variables. A factor is an independent treatment variable whose settings (values) are controlled and varied by the experimenter. Each setting of a factor is called a level. Levels may be quantitative numbers or, in many cases, simply “present” or “not present” (“1” or “0”). For example, to find the effect of temperature on resistors, resistance was measured before and after placing the resistors in 3 ovens set at different temperatures. Here, the dependent outcome variable is resistance and the independent treatment variable is temperature. The different temperature settings are the “levels”; here there are 3. In another example, to find the effect of temperature and heating time on resistors, resistance was measured before and after placing the resistors in 3 ovens set at three different temperatures for 3 different periods. In this case, both “Temperature” and “Time” are factors.
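As a rough sketch of how factors and levels fit together, the resistor example with two factors can be laid out as a table of runs. The specific temperatures and heating times below are invented for illustration, and the sketch assumes every temperature is combined with every heating time (a full factorial layout), which the example above does not state explicitly. The name `design_runs` is hypothetical.

```python
from itertools import product

# Hypothetical level settings for the two factors (values are made up)
factors = {
    "temperature_c": [50, 100, 150],  # 3 levels
    "time_min": [10, 20, 30],         # 3 levels
}

def design_runs(factors):
    """List every combination of factor levels as one experimental run."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]

runs = design_runs(factors)
levels_per_factor = {name: len(levels) for name, levels in factors.items()}

print(levels_per_factor)
for run in runs:
    # Resistance (the dependent variable) would be measured for each run
    print(run)
```

Each printed row is one treatment combination; the measured resistance would be recorded alongside it as the outcome.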

Qualitative vs. Quantitative Variables

Variables can also be grouped according to whether or not they can be expressed in numbers. Variables that cannot be expressed in numbers (for example, level of happiness, beauty, ethics, love etc.) are called qualitative variables. Some qualitative variables can be grouped into different labels or categories (for example, gender, nationality etc.). Such variables are known as categorical variables or attribute variables. Variables that can be expressed in numbers, such as height, weight, molarity, photon flux density etc., are called quantitative variables.
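A minimal sketch of this distinction, assuming the values of a variable are stored in a Python list: a variable is treated as quantitative when all its values are numbers, and as qualitative otherwise. The function names and the sample values are hypothetical, and the rule is deliberately naive, since categories stored as numeric codes (such as 0 and 1) would be wrongly labelled quantitative.

```python
from numbers import Number

def variable_kind(values):
    """Label a variable by whether all of its values are numbers."""
    # bool is excluded because True/False usually encode a category
    if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
        return "quantitative"
    return "qualitative"

def categories(values):
    """Distinct labels of a categorical variable, in sorted order."""
    return sorted(set(values))

# Made-up sample values
tree_heights_m = [4.2, 5.1, 3.8, 6.0]
nationality = ["Indian", "Kenyan", "Indian", "Brazilian"]

print(variable_kind(tree_heights_m))
print(variable_kind(nationality))
print(categories(nationality))
```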
