What is a histogram?

A histogram is a type of graph commonly used to visualize the univariate distribution of numeric data. The data is displayed as bins, each of which represents the number of datapoints that fall within a range of values. The bins, and the distribution they form, reveal useful information about the data, such as its central location, spread, and shape. A histogram can also be used to find outliers and gaps in the data.
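
To make the idea of bins concrete, here is a minimal sketch in base R that counts how many values fall into each range. The `ages` vector and the 10-year bin width are made-up values chosen for illustration; they are not data from this article.

```r
# Hypothetical ages, invented purely for illustration
ages <- c(23, 35, 37, 41, 44, 48, 52, 53, 55, 57, 58, 59, 61, 64, 72, 95)

# Bin edges every 10 years, from 20 to 100
breaks <- seq(20, 100, by = 10)

# Assign each age to a bin; right = FALSE gives bins [20,30), [30,40), ...
bins <- cut(ages, breaks = breaks, right = FALSE)

# Count the datapoints in each bin; empty bins are kept with a count of 0
bin_counts <- table(bins)
print(bin_counts)
```

In this sketch the bin with the highest count shows where the values are concentrated, and a bin with a count of zero is the kind of gap that a histogram makes visible.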

A basic histogram of age looks like this.

[Figure: ggplot histogram of age]

From the above histogram, it can be seen that most people fall within the 50-60 age range, while the 70-80 and 90-100 ranges contain far fewer people. There is also a gap in the histogram at 80-90, which indicates that data for that age range might be missing or unavailable. A histogram like this can therefore be used to visualize useful information about a continuous numeric variable. Let's look more closely at these histograms, how to create them, and their various customization options below.
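
A histogram with a similar shape can be sketched with ggplot2 as follows. The `people` data frame is hypothetical and was invented so that the 50-60 bin is the tallest and the 80-90 bin is empty; it is not the data behind the figure above. The bin width of 10 and the axis labels are likewise assumptions.

```r
library(ggplot2)

# Hypothetical data frame; the ages are invented for illustration
people <- data.frame(
  age = c(23, 35, 37, 41, 44, 48, 52, 53, 55, 57, 58, 59, 61, 64, 72, 95)
)

# binwidth = 10 with boundary = 0 places bin edges on multiples of 10
age_hist <- ggplot(people, aes(x = age)) +
  geom_histogram(binwidth = 10, boundary = 0, colour = "white") +
  labs(x = "Age", y = "Number of people")

print(age_hist)
```

Setting `boundary = 0` aligns the bin edges with round numbers, which makes ranges such as 50-60 easy to read off the x-axis. Without `binwidth` or `bins`, `geom_histogram()` falls back to 30 bins and prints a message suggesting a better value.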

Histograms and Bar Charts

Histograms are sometimes confused with bar charts. Although a histogram looks similar to a bar chart, the major difference is that a histogram plots the frequency of occurrences in a continuous data set that has been divided into classes, called bins. Bar charts, on the other hand, are used to plot categorical data.
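
The distinction can be illustrated in ggplot2, where the two chart types use different geoms. This is a rough sketch: the `people` data frame and its `age` and `gender` columns are hypothetical and serve only to contrast a continuous variable with a categorical one.

```r
library(ggplot2)

# Hypothetical data: one continuous column and one categorical column
people <- data.frame(
  age    = c(23, 35, 41, 52, 55, 58, 61, 72),
  gender = c("F", "M", "F", "F", "M", "M", "F", "M")
)

# Continuous variable: geom_histogram() groups the values into bins first
hist_plot <- ggplot(people, aes(x = age)) +
  geom_histogram(binwidth = 10, boundary = 0)

# Categorical variable: geom_bar() draws one bar per distinct category
bar_plot <- ggplot(people, aes(x = gender)) +
  geom_bar()

print(hist_plot)
print(bar_plot)
```

In the histogram the number of bars depends on the chosen bin width, whereas in the bar chart it is fixed by the number of distinct categories in the data.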

