A frequency distribution is a summary of a data set that shows how many times each value or category occurs. Frequency distributions are often used to manage large data sets with a range of values. Once data is collected and sorted, a frequency distribution can be displayed using visual tools like pie charts, bar graphs, and histograms, or laid out in a spreadsheet table for easy consumption.

For an example of a real-world frequency distribution, imagine a university registrar trying to assess student behavior to plan course offerings and class limits. The registrar might use a frequency distribution to analyze the number of classes that undergraduates have been registering for in a semester. The registrar could scan registration records and see the number of students taking two classes per semester, three classes per semester, four classes per semester, and so forth. They could then represent these data on a pie chart or bar chart to share with faculty as they prepare to adjust course offerings for future semesters.
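As a minimal sketch of the registrar's tally, the snippet below counts how many students registered for each number of classes. The registration records are hypothetical values invented for illustration, not data from a real registrar.

```python
from collections import Counter

# Hypothetical registration records: number of classes each student enrolled in.
classes_per_student = [4, 3, 4, 2, 5, 4, 3, 4, 5, 3, 4, 2, 4, 3, 5, 4, 2, 3, 4, 5]

# Counter builds the frequency distribution: value -> number of occurrences.
frequency = Counter(classes_per_student)

# Print the distribution from the smallest course load to the largest.
for n_classes in sorted(frequency):
    print(f"{n_classes} classes: {frequency[n_classes]} students")
```

Sorting the keys before printing keeps the table in order of course load, which is usually how such a table would be read.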

There are two main types of frequency distributions used in data analysis: relative frequency distributions and cumulative frequency distributions. Both hinge upon frequency, which, in descriptive statistics, is the number of times something occurs within a given data set.

1. Relative frequency distribution: Relative frequency refers to the number of times a particular outcome occurs divided by the total number of outcomes. You can write relative frequency as a fraction, decimal, or percentage. For instance, if a pizza restaurant noted that customers ordered pepperoni on five of twelve pies, pepperoni’s relative frequency would be 5/12. If customers ordered onion on three of twelve pies, onion’s relative frequency would be 3/12 (which reduces to 1/4).

2. Cumulative frequency distribution: Cumulative frequency is a running total of frequencies across categories or values; when you add up relative frequencies instead of raw counts, the result is a cumulative relative frequency. To continue the pizza example, calculate the cumulative relative frequency of pepperoni orders plus onion orders. Add 5/12 (for pepperoni) and 3/12 (for onion), resulting in a cumulative relative frequency of 8/12 (which reduces to 2/3). This means that, in this sample, eight of the twelve pizzas had either pepperoni or onion on them, assuming no pizza had both toppings.
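The pizza arithmetic can be sketched in a few lines of Python. The pepperoni and onion counts come from the example above; the "other" category for the remaining four pies is an assumption added so the counts sum to twelve, and the sketch also assumes each pie has exactly one topping.

```python
from fractions import Fraction
from itertools import accumulate

# 12 pies total: 5 pepperoni and 3 onion (from the example).
# "other" is a hypothetical catch-all for the remaining 4 pies.
counts = {"pepperoni": 5, "onion": 3, "other": 4}
total = sum(counts.values())

# Relative frequency: each count divided by the total number of outcomes.
relative = {topping: Fraction(n, total) for topping, n in counts.items()}

# Cumulative relative frequency: running total of the relative frequencies.
cumulative = dict(zip(relative, accumulate(relative.values())))

for topping in counts:
    print(f"{topping:<10} {counts[topping]:>2}  {relative[topping]}  {cumulative[topping]}")
```

Using `Fraction` keeps the results exact and reduces them automatically, so 3/12 appears as 1/4 and the running total after onion appears as 2/3.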

Frequency distributions can prove useful when representing simple data sets, but they can also apply to higher-level descriptive statistics.

- Statistical hypothesis testing: True to its name, statistical hypothesis testing uses statistical data sets to test predictions made in a hypothesis. When researchers assemble data in a frequency distribution, they can calculate measures of central tendency (the mean, median, and mode). They can also find the variance and the standard deviation (the square root of the variance), which describe the statistical dispersion (overall variability) of the data set.
- Frequency analysis: Cryptanalysts (those who study how to decode encrypted communications) use letter frequency distributions to help break ciphers and decipher writing in unfamiliar scripts.
- Probability theory: A branch of mathematics known as probability theory uses frequency distributions to describe how data is likely to be distributed. Statisticians often check whether a frequency distribution approximates a normal distribution, a symmetric bell curve in which fixed proportions of the data fall within one, two, and three standard deviations of the mean. A normal distribution is called “mesokurtic.” A distribution with lighter tails than a normal distribution is “platykurtic,” and one with heavier tails is “leptokurtic.” Separately, a distribution that is not symmetric around its center exhibits skewness.
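To illustrate how descriptive statistics follow from a frequency distribution, here is a minimal sketch using Python's standard `statistics` module. The frequency table (classes taken per semester mapped to a number of students) is hypothetical, and the sketch treats the data as a whole population rather than a sample.

```python
import statistics

# Hypothetical frequency table: classes per semester -> number of students.
frequency = {2: 3, 3: 5, 4: 8, 5: 4}

# Expand the table back into one observation per student.
observations = [value for value, count in frequency.items() for _ in range(count)]

# Measures of central tendency.
mean = statistics.mean(observations)
median = statistics.median(observations)
mode = statistics.mode(observations)

# Measures of dispersion (population versions).
variance = statistics.pvariance(observations)
std_dev = statistics.pstdev(observations)  # square root of the variance

print(mean, median, mode, variance, std_dev)
```

If the data were a sample drawn from a larger group, `statistics.variance` and `statistics.stdev` would be the corresponding sample versions.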

In many everyday applications, frequency distributions take on one of the following graphical representations.

1. Pie charts: Pie charts represent the total data set as a circle, and each category appears as a sector, or “wedge,” of the pie sized in proportion to its frequency.

2. Bar graphs: Bar graphs represent data frequency using bars of equal width and equal spacing. Bar graphs suit categorical variables and discrete variables that can be counted.

3. Histograms: Histograms resemble bar graphs, but they do not have spacing between bars. Histograms represent continuous variables that are measured rather than counted, so each bar covers a range of values.

4. Frequency polygons: You can turn a histogram into a frequency polygon by joining the midpoints of the tops of the bars in the histogram. These connected midpoints end up looking like a line graph that mimics the contours of the histogram.
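The relationship between a histogram and a frequency polygon can be sketched without any plotting library. In the snippet below, the measurements, the class boundaries (60 to 100), and the class width (10) are all hypothetical choices made for illustration.

```python
# Hypothetical continuous measurements; not taken from any real data set.
values = [61.5, 64.0, 68.2, 70.1, 73.4, 75.0, 77.7, 79.9, 82.3, 88.8, 91.0, 95.5]

low, high, width = 60, 100, 10  # assumed class boundaries and class width
n_bins = (high - low) // width

# Histogram: count the values in each half-open class [start, start + width).
counts = [0] * n_bins
for v in values:
    index = min(int((v - low) // width), n_bins - 1)  # v == high goes in the last class
    counts[index] += 1

# Frequency polygon: one point per class at (class midpoint, frequency).
polygon = [(low + width * (i + 0.5), counts[i]) for i in range(n_bins)]

# Text rendering of the histogram, one row per class.
for i, count in enumerate(counts):
    start = low + i * width
    print(f"{start:>3}-{start + width:<3} {'#' * count}")
print(polygon)
```

Passing the `polygon` points to a line-plotting tool would draw the frequency polygon, while the `counts` list holds the bar heights of the histogram.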

You can make a frequency distribution table that shows either grouped frequency distribution or ungrouped frequency distribution. Grouped data means that you combine multiple values into one class or interval. Ungrouped data means that each entry in the table corresponds to a single value. For example, try making a grouped frequency distribution table for student exam scores in a hypothetical history class.

1. Collect all data points. First, collect all exam scores from the hypothetical history class. Listed from lowest value to highest value, those hypothetical exam scores are: 68, 72, 74, 79, 81, 85, 85, 89, 92, 95.

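2. Group the data. As this is a grouped frequency distribution table, group the test scores into four categories: As, Bs, Cs, and Ds. Assuming a common grading scale (90 and above for an A, 80 to 89 for a B, 70 to 79 for a C, and 60 to 69 for a D), you have two As, four Bs, three Cs, and one D.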

3. Put the data in tabular form. Create a two-column table where the first column will be for the specific letter grade and the second column will be for the number of students who scored that letter grade on their exam.

4. Make additional tables as desired. You could also make a cumulative frequency distribution table from the same data set. For instance, you could have an entry that combines the students who got As and students who got Bs, or an entry that combines the As, the Bs, and the Cs.

5. Convert the data into graphical representations as desired. If you plan to share your data with others, you may choose to convert your frequency distribution table to a pie chart, bar graph, histogram, frequency polygon, or other graphical representation.
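The steps above can be sketched in Python as follows. The exam scores are the ones listed in step 1; the 10-point grade cutoffs in the hypothetical `letter_grade` helper are an assumption, chosen because they reproduce the counts given in step 2.

```python
from collections import Counter
from itertools import accumulate

# Step 1: the exam scores from the hypothetical history class.
scores = [68, 72, 74, 79, 81, 85, 85, 89, 92, 95]

def letter_grade(score):
    """Map a score to a letter grade using assumed 10-point cutoffs."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"

# Step 2: group the scores by letter grade.
grades = ["A", "B", "C", "D"]
frequency = Counter(letter_grade(s) for s in scores)
counts = [frequency[g] for g in grades]

# Step 4: cumulative frequencies (As, then As + Bs, and so on).
cumulative = list(accumulate(counts))

# Step 3: print the table, with the cumulative column alongside.
print(f"{'Grade':<6}{'Frequency':>10}{'Cumulative':>12}")
for grade, count, running in zip(grades, counts, cumulative):
    print(f"{grade:<6}{count:>10}{running:>12}")
```

The last cumulative entry equals the total number of students, which is a quick check that no score was left out of the table.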