Have you ever wondered how NASA zeroes in on certain planets to study extensively while ignoring thousands of others, or how robot-assisted surgeries are becoming increasingly common? Or maybe you wonder how YouTube knows exactly what each of us wants to watch next, and how Amazon recommends products that seem to suit our tastes. These seemingly complex tasks are being perfected in today’s world by analysing, you guessed it, huge volumes of data.

Data has always played a significant role in human life, and now more than ever. Whichever industry you work in, or whatever your field of interest, chances are you have already heard about how data is transforming our world. But what exactly is data? How did it begin, and how did it take the form we know today? Let’s take a closer look.

To begin, let’s look at what the Oxford Dictionary has to say about ‘data’ (plural of ‘datum’):

/ˈdeɪtə/, [uncountable, plural] facts or information, especially when examined and used to find out things or to make decisions.

That seems pretty straightforward, or is it? Data is a collection of facts, such as numbers, words, measurements or just descriptions of things, collected through observation. Now, the terms ‘data’ and ‘information’ are often used interchangeably. However, the popular opinion is that data becomes information only when it is arranged in an organized form and viewed in context.

Being able to define data is not sufficient, though. To discuss its impact on our lives, we have to address a few fundamental questions. For example, how do we obtain data? In almost every sphere of our lives, data first has to be measured, collected or reported. Then we need to refine or clean it so that it fits our specific use cases. Today we have a very detailed architecture which enables us to perform these steps and much more. The next obvious question might be: why should we utilize data? Put simply, careful study and analysis of data helps us answer a lot of questions that we might otherwise have missed. Insights driven by data can enable us to make smart and efficient decisions. In a way, data helps us learn better from our history. And speaking of history, let’s quickly look back at the past to get a general sense of where it all began.

The earliest forms of data were tally or tick marks, which ancient civilizations used to track or record inventories such as food or cattle. That is what enabled these people to, for example, make the first transactions and account for the agricultural land each farmer owned. It is interesting to note that these records can generally be thought of as numerical data, which fits with the fact that early mathematics has the same origins in the Mesopotamian civilizations. From there, instruments like the abacus were invented to help with calculations on such records. Later, other data related to astronomical studies and time-keeping led to scientific discoveries.

Eventually, as more forms of data were discovered, the need for tools to collect, analyze, and store it also quickly emerged.

First, let us look at the broad categories of data at our disposal, purely from a statistical point of view. Later we will elaborate on the forms that we generally encounter on a day-to-day basis.

Fundamentally we can distinguish data as being either quantitative or qualitative.

  • When we speak of Quantitative data, we refer to data that can be measured numerically, such as distance, duration, length, revenue or speed. Numerical computations can be performed directly on such data. Quantitative data is also known as Numerical data.

  • Qualitative data, on the other hand, are non-numerical facts or observations that represent attributes or characteristics. They are descriptions that can be observed but cannot be computed or calculated. For example, data on attributes such as intelligence, honesty, and creativity collected from the students of a class would be classified as qualitative. Such data are more exploratory than conclusive in nature. They are often referred to as Categorical data.
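
To make the distinction above concrete, here is a minimal Python sketch. The sample values and the names `commute_km` and `temperament` are made up purely for illustration. It only shows that arithmetic makes sense for quantitative values, while qualitative labels are usually summarized by counting them.

```python
from collections import Counter
from statistics import mean

# Hypothetical samples, invented for this illustration.
commute_km = [2.5, 4.0, 1.5, 4.0]                        # quantitative: measured distances
temperament = ["calm", "curious", "calm", "creative"]    # qualitative: descriptive labels

# Quantitative data supports arithmetic such as sums and averages.
average_km = mean(commute_km)

# Qualitative labels cannot be averaged, but we can count how often each occurs.
label_counts = Counter(temperament)
most_common_label, occurrences = label_counts.most_common(1)[0]

print("average distance:", average_km)
print("most frequent label:", most_common_label, "seen", occurrences, "times")
```

Trying to call `mean(temperament)` would raise an error, which is a practical reminder that the type of data decides which operations are meaningful.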

Both quantitative and qualitative data can be further classified. Let’s quickly go through each for better clarity.

Numerical or quantitative data can be of two types — Discrete or Continuous.

  • Discrete data can take only certain specific values rather than any value in a range. These are typically whole numbers, such as counts, and can’t be subdivided into smaller and smaller parts. A common way to represent them is with bar charts.
  • Continuous data can be broken into smaller and smaller units and can take any value within a range bounded by a lowest and a highest value. The difference between the highest and lowest values is called the range of the data. Continuous data can be tabulated in what is called a frequency distribution and represented graphically using histograms.
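
The ideas of range and frequency distribution can be sketched in a few lines of Python. The data below is invented, and the helper `frequency_distribution` along with its parameters (`start`, `width`, `n_bins`) is a hypothetical name chosen for this example, not a standard library function. Treat it as one simple way to group values into equal-width class intervals.

```python
from collections import Counter

# Hypothetical samples, invented for this illustration.
children_per_household = [0, 2, 1, 2, 3, 2]                  # discrete: counts
heights_cm = [151.2, 163.8, 158.4, 171.9, 166.0, 159.5]      # continuous: measurements

# Discrete data: count how often each exact value occurs
# (these counts would be the bar heights in a bar chart).
discrete_counts = Counter(children_per_household)

# Continuous data: the range is the highest value minus the lowest value.
data_range = max(heights_cm) - min(heights_cm)


def frequency_distribution(values, start, width, n_bins):
    """Count how many values fall into each interval [low, low + width)."""
    counts = [0] * n_bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < n_bins:      # values outside the intervals are ignored
            counts[index] += 1
    return {
        (start + i * width, start + (i + 1) * width): count
        for i, count in enumerate(counts)
    }


# Group heights into 10 cm intervals: 150-160, 160-170, 170-180.
height_table = frequency_distribution(heights_cm, start=150, width=10, n_bins=3)

print(discrete_counts)
print(round(data_range, 1))
for (low, high), count in height_table.items():
    print(f"{low}-{high} cm: {count}")
```

The interval counts in `height_table` are what a histogram would draw as bar heights. The choice of interval width is a judgment call and changes how the distribution looks.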

Categorical or Qualitative Data are also of two types — Ordinal and Nominal.

  • In ordinal data, the variables have natural, ordered categories. Ordinal data points are fundamentally ranked. When filling out a questionnaire, we often encounter a rating scale that runs from bad through satisfactory and good to excellent. This is a perfect example of ordinal data.
  • Nominal values represent discrete units and are used to label variables that have no quantitative value. They differ from ordinal data in that they have no ranking or order. A person can either be a citizen of a country or not. One cannot have “more citizenship” than another person. Therefore it is impossible to order citizenship according to any sort of mathematical logic.
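
The practical difference between the two shows up in what we are allowed to do with the values. The short Python sketch below uses invented survey responses. The names `RATING_ORDER` and `rank` are assumptions made for this example. Ordinal labels can be sorted and have a meaningful middle value, while nominal labels can only be compared for equality and counted.

```python
from collections import Counter

# Ordinal data: the labels carry an explicit order, from worst to best.
RATING_ORDER = ["bad", "satisfactory", "good", "excellent"]
rank = {label: position for position, label in enumerate(RATING_ORDER)}

# Hypothetical questionnaire responses, invented for this illustration.
responses = ["good", "bad", "excellent", "good", "satisfactory"]

# Because the categories are ordered, sorting and taking the middle value make sense.
sorted_responses = sorted(responses, key=lambda label: rank[label])
median_response = sorted_responses[len(sorted_responses) // 2]  # odd number of responses

# Nominal data: no order exists, so counting is about all we can do.
citizenship = ["citizen", "non-citizen", "citizen", "citizen"]
citizenship_counts = Counter(citizenship)

print(sorted_responses)
print("median rating:", median_response)
print(citizenship_counts)
```

Note that the numeric positions in `rank` only encode order. The gap between "bad" and "satisfactory" is not necessarily the same as the gap between "good" and "excellent", so averaging those positions would be misleading.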

