Variance is a basic idea in statistics and probability. It measures how spread out a group of data points is in a data series. It shows how much the numbers in a dataset differ from the mean, or average, of that dataset. In simpler terms, variance indicates how far each number in the set is from the mean and from each other number in the set.
The variance definition in statistics can be written as:
“Variance is the average of the squared differences from the mean.”
The more the data points differ from the mean, the higher the variance. The closer they are to the mean, the lower the variance. It is important in different statistical analyses and serves as the foundation for concepts like standard deviation, hypothesis testing, and inferential statistics.
Understanding the variance formula is critical in solving problems and making statistical analyses. There are two main types of variance formulas:
When you have data for the entire population:
σ² = (1/N) × Σ(xᵢ - μ)²
Where:
σ² is the population variance
xᵢ is each data point
μ is the population mean
N is the number of data points in the population
s² = (1/(n - 1)) × Σ(xᵢ - x̄)²
Where:
s² is the sample variance
x̄ is the sample mean
n is the number of data points in the sample
This variance formula is used when dealing with a sample rather than the entire population.
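The two formulas above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation; the function names population_variance and sample_variance are chosen here for illustration and do not come from any library.

```python
def population_variance(data):
    """Average squared deviation from the mean, dividing by N."""
    n = len(data)
    if n == 0:
        raise ValueError("need at least one data point")
    mu = sum(data) / n
    return sum((x - mu) ** 2 for x in data) / n


def sample_variance(data):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)


values = [4, 6, 8, 10, 12]          # an arbitrary small dataset
print(population_variance(values))  # 8.0
print(sample_variance(values))      # 10.0
```

The only difference between the two functions is the divisor: N for a population, n - 1 for a sample.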
Understanding the properties of variance helps in interpreting data in a more meaningful way:
Variance is always non-negative.
Since the differences are squared, negative deviations turn into positive ones.
If all values are the same, the variance is 0.
This means there is no dispersion from the mean.
Units of variance are the square of the units of the data.
For example, if the data is in meters, the variance is in meters².
Variance is affected by scale.
If we multiply each data value by a constant, the variance is multiplied by the square of that constant.
There is an additive property for independent variables.
Var(X + Y) equals Var(X) plus Var(Y) if X and Y are independent.
Variance is sensitive to outliers.
Large deviations from the mean can significantly impact the variance.
These properties make variance a strong tool for analyzing data dispersion.
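The properties listed above can be checked numerically on small datasets. The sketch below is only an illustration: pvar is a helper name introduced here, the datasets are made up, and independence is modelled by pairing every value of X with every value of Y with equal weight. Exact fractions are used so the equalities hold without rounding error.

```python
from fractions import Fraction
from itertools import product


def pvar(values):
    """Population variance computed exactly with fractions."""
    values = [Fraction(v) for v in values]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


data = [4, 6, 8, 10, 12]

# Non-negativity, and zero variance when all values are the same.
assert pvar(data) >= 0
assert pvar([3, 3, 3, 3]) == 0

# Scaling: multiplying every value by c multiplies the variance by c squared.
c = 3
assert pvar([c * x for x in data]) == c ** 2 * pvar(data)

# Additivity: pairing every x with every y, each pair equally likely,
# makes X and Y independent, so Var(X + Y) = Var(X) + Var(Y).
xs, ys = [1, 2, 3], [10, 20]
sums = [x + y for x, y in product(xs, ys)]
assert pvar(sums) == pvar(xs) + pvar(ys)

# Sensitivity to outliers: one extreme value inflates the variance.
assert pvar(data + [100]) > 10 * pvar(data)
```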
A common question arises: what is the connection between variance and standard deviation?
The standard deviation is the square root of the variance. Variance measures spread in squared units. Standard deviation expresses it in the same units as the original data, making it easier to understand in many real-life situations.
Variance is convenient in statistical theory (for example, it is additive for independent variables), while standard deviation is easier to interpret, so the formulas for standard deviation and variance are often used together in data analysis.
Example of Relationship:
If variance = 16, then standard deviation = √16 = 4.
If standard deviation = 5, then variance = 25.
This shows how closely related variance and standard deviation are in statistical theory.
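The conversion in both directions is a one-line calculation. The snippet below is a small sketch using Python's standard math and statistics modules; the dataset in the last lines is an arbitrary example.

```python
import math
import statistics

# From variance to standard deviation: take the square root.
variance = 16
std_dev = math.sqrt(variance)      # 4.0

# From standard deviation to variance: square it.
std_dev_2 = 5
variance_2 = std_dev_2 ** 2        # 25

# The same relationship holds for a dataset (up to floating-point rounding).
data = [4, 6, 8, 10, 12]
assert math.isclose(statistics.stdev(data) ** 2, statistics.variance(data))
```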
To master how to calculate variance, follow the systematic steps outlined below:
Step 1: Find the Mean (Average)
Add all the data values and divide by the number of values.
Step 2: Subtract the Mean from Each Data Point
This gives the deviation of each point from the mean.
Step 3: Square Each Deviation
This ensures all differences are positive and emphasizes larger deviations.
Step 4: Calculate the Average of the Squared Deviations
For population: divide by N
For sample: divide by (n – 1)
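The four steps can be written out one line per step. The function below is a sketch for illustration; its name variance_steps, the sample flag, and the dataset in the usage lines are assumptions made here, not part of the procedure itself.

```python
def variance_steps(data, sample=True):
    """Return every intermediate result of the four-step calculation."""
    n = len(data)
    mean = sum(data) / n                        # Step 1: find the mean
    deviations = [x - mean for x in data]       # Step 2: subtract the mean
    squared = [d ** 2 for d in deviations]      # Step 3: square each deviation
    divisor = n - 1 if sample else n            # Step 4: n - 1 for a sample, N for a population
    return {
        "mean": mean,
        "deviations": deviations,
        "squared_deviations": squared,
        "variance": sum(squared) / divisor,
    }


result = variance_steps([2, 4, 6, 8], sample=False)
print(result["mean"])                # 5.0
print(result["squared_deviations"])  # [9.0, 1.0, 1.0, 9.0]
print(result["variance"])            # 5.0
```

Returning the intermediate lists makes it easy to compare each step against a hand calculation.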
Let’s break this down with a variance example.
Variance Example 1
Dataset: 4, 6, 8, 10, 12
Mean = (4 + 6 + 8 + 10 + 12) / 5 = 40 / 5 = 8
Deviations from the mean:
4 − 8 = -4
6 − 8 = -2
8 − 8 = 0
10 − 8 = 2
12 − 8 = 4
Squared deviations:
(-4)² = 16
(-2)² = 4
0² = 0
2² = 4
4² = 16
Sum of squared deviations = 16 + 4 + 0 + 4 + 16 = 40
Sample variance = 40 / (5 - 1) = 40 / 4 = 10
Standard deviation = √10 ≈ 3.16
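As a cross-check, Example 1 can be reproduced with Python's standard statistics module, where variance and stdev use the sample (n - 1) divisor. This is a quick verification sketch, not part of the hand calculation.

```python
import math
import statistics

data = [4, 6, 8, 10, 12]

s2 = statistics.variance(data)   # sample variance, divides by n - 1
s = statistics.stdev(data)       # square root of the sample variance

assert s2 == 10
assert math.isclose(s, math.sqrt(10))
print(round(s, 2))               # 3.16
```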
Variance Example 2
Dataset: 2, 4, 4, 4, 6, 8
Mean = (2 + 4 + 4 + 4 + 6 + 8) / 6 = 28 / 6 ≈ 4.67
Deviations from the mean:
2 − 4.67 = -2.67
4 − 4.67 = -0.67
4 − 4.67 = -0.67
4 − 4.67 = -0.67
6 − 4.67 = 1.33
8 − 4.67 = 3.33
Squared deviations:
(-2.67)² ≈ 7.13
(-0.67)² ≈ 0.45
(-0.67)² ≈ 0.45
(-0.67)² ≈ 0.45
(1.33)² ≈ 1.77
(3.33)² ≈ 11.09
Sum = 7.13 + 0.45 + 0.45 + 0.45 + 1.77 + 11.09 ≈ 21.34
Sample variance = 21.34 / (6 - 1) = 21.34 / 5 ≈ 4.27
Standard deviation = √4.27 ≈ 2.07
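Example 2 rounds the mean to 4.67 before squaring, so the intermediate figures are approximate. The sketch below repeats the calculation with exact fractions to show that the rounded answers still agree to two decimal places; the variable names are chosen here for illustration.

```python
import math
from fractions import Fraction

data = [Fraction(x) for x in (2, 4, 4, 4, 6, 8)]
n = len(data)

mean = sum(data) / n                            # exactly 14/3
sum_sq = sum((x - mean) ** 2 for x in data)     # exactly 64/3
sample_var = sum_sq / (n - 1)                   # exactly 64/15
sample_sd = math.sqrt(sample_var)

print(round(float(sample_var), 2))  # 4.27
print(round(sample_sd, 2))          # 2.07
```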
Variance Example 3
Dataset: 3, 3, 3, 3
Mean = (3 + 3 + 3 + 3) / 4 = 12 / 4 = 3
Deviations:
Each value − 3 = 0
Squared deviations = 0 for each
Sum of squared deviations = 0
Population variance = 0 / 4 = 0
Standard deviation = √0 = 0
This shows when all numbers are the same, variance is 0.
This is a simple variance example that illustrates each calculation step and reinforces the variance definition.
Problem 1:
The ages of five employees are: 25, 30, 35, 40, 45. Find the variance and standard deviation.
Solution:
Step 1: Find the Mean
Mean = (25 + 30 + 35 + 40 + 45)/5 = 175/5 = 35
Step 2: Deviation from Mean
25 – 35 = -10
30 – 35 = -5
35 – 35 = 0
40 – 35 = 5
45 – 35 = 10
Step 3: Squared Deviations
100, 25, 0, 25, 100
Step 4: Sum of Squares
Sum = 250
Population Variance:
σ² = 250 / 5 = 50
Sample Variance:
s² = 250 / 4 = 62.5
Standard Deviation (sample):
s = √62.5 ≈ 7.91
So, the variance is 62.5 (sample) and the standard deviation is approximately 7.91. This solved example illustrates how to calculate variance in real-world datasets.
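Problem 1 can be checked with the standard statistics module, which offers separate functions for the population and sample versions. The sketch below also computes the population standard deviation, which the worked solution does not state.

```python
import statistics

ages = [25, 30, 35, 40, 45]

pop_var = statistics.pvariance(ages)   # divides by N      -> 50
samp_var = statistics.variance(ages)   # divides by n - 1  -> 62.5
pop_sd = statistics.pstdev(ages)       # square root of the population variance
samp_sd = statistics.stdev(ages)       # square root of the sample variance

print(round(pop_sd, 2))    # 7.07
print(round(samp_sd, 2))   # 7.91
```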
Problem 2:
The marks scored by 5 students in a math test are: 45, 50, 55, 60, and 65.
Find the sample variance and standard deviation of the marks.
Solution:
Find the mean:
(45 + 50 + 55 + 60 + 65) / 5 = 275 / 5 = 55
Find the deviation of each mark from the mean:
45 − 55 = -10
50 − 55 = -5
55 − 55 = 0
60 − 55 = 5
65 − 55 = 10
Square each deviation:
(-10)² = 100
(-5)² = 25
0² = 0
5² = 25
10² = 100
Add the squared deviations:
100 + 25 + 0 + 25 + 100 = 250
Divide by (n − 1):
Sample size n = 5
Sample variance = 250 / (5 − 1) = 250 / 4 = 62.5
Standard deviation = √62.5 ≈ 7.91
Final Answer:
Sample variance = 62.5
Standard deviation ≈ 7.91
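The answer matches the sample variance in Problem 1 for a reason: the marks are the ages from Problem 1 with 20 added to each value, and adding a constant to every data point moves the mean but leaves the deviations unchanged. A short sketch of that check, using the standard statistics module:

```python
import statistics

ages = [25, 30, 35, 40, 45]    # data from Problem 1
marks = [45, 50, 55, 60, 65]   # data from Problem 2

# Every mark is the corresponding age plus 20.
assert marks == [a + 20 for a in ages]

# Shifting the data does not change the variance.
assert statistics.variance(marks) == statistics.variance(ages) == 62.5
```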
Here are key takeaways about variance and its applications:
Variance Definition: It measures the average squared deviation from the mean.
How to Calculate Variance: Follow steps that include finding the mean, calculating deviations, squaring those, and averaging.
Variance Formula: There are different formulas for sample and population data.
Variance and Standard Deviation: These are closely related; the standard deviation is the square root of variance.
Variance Example: Working through examples helps with understanding the concept; start with small data sets.
Standard Deviation and Variance Formula: These are often used together in descriptive statistics, but they are not interchangeable, since one is the square root of the other.
Variance is foundational; it is used in regression, probability theory, and hypothesis testing.
Finance: To calculate the risk or volatility of an investment.
Quality Control: Measuring consistency in product manufacturing.
Education: Looking at how students perform in exams.
Sports Analytics: Evaluating how players' performance varies.
In summary, variance is a statistical tool that measures how data values spread out from the mean. Knowing what variance means, understanding the variance formula, and learning how to calculate it are important skills for students, analysts, and researchers.
The link between variance and standard deviation helps us make sense of dispersion in simple terms. By looking at several variance examples and solved problems, these concepts become clearer and more applicable to real-life situations.
Whether in school or in the workplace, mastering variance can lead to a better understanding of how data behaves, its variability, and statistical inference.
Ans: Variance is a statistical measure that tells us how spread out the values in a dataset are from the mean (average).
In simple words, it shows how much the numbers differ from each other and from the average value. A low variance means the numbers are close to the mean, and a high variance means they are spread out.
Ans: Variance is the average of the squared differences from the mean of a dataset.
For example, if you have a dataset like [4, 6, 8], the variance will measure how far each number is from the average (mean), and it’s calculated by squaring those differences and averaging them.
Ans: Mean = (1 + 2 + 3 + 4 + 5) / 5 = 15 / 5 = 3
Deviations from the mean:
1 − 3 = -2
2 − 3 = -1
3 − 3 = 0
4 − 3 = 1
5 − 3 = 2
Squared deviations:
(-2)² = 4
(-1)² = 1
0² = 0
1² = 1
2² = 4
Sum = 4 + 1 + 0 + 1 + 4 = 10
If using population variance:
σ² = 10 / 5 = 2
If using sample variance:
s² = 10 / (5 - 1) = 10 / 4 = 2.5
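Because both versions share the same sum of squared deviations, the sample variance equals the population variance multiplied by n / (n - 1). The sketch below confirms this for the dataset above using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

data = [1, 2, 3, 4, 5]
n = len(data)

mean = Fraction(sum(data), n)                         # 3
sum_sq = sum((x - mean) ** 2 for x in data)           # 10

pop_var = sum_sq / n                                  # 2
samp_var = sum_sq / (n - 1)                           # 5/2

# Sample variance = population variance * n / (n - 1)
assert samp_var == pop_var * Fraction(n, n - 1)
```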
Ans: Mean = (3 + 5 + 7 + 9 + 11) / 5 = 35 / 5 = 7
Deviations from the mean:
3 − 7 = -4
5 − 7 = -2
7 − 7 = 0
9 − 7 = 2
11 − 7 = 4
Squared deviations:
(-4)² = 16
(-2)² = 4
0² = 0
2² = 4
4² = 16
Sum = 16 + 4 + 0 + 4 + 16 = 40
Population variance:
σ² = 40 / 5 = 8
Sample variance:
s² = 40 / 4 = 10
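The same population variance can also be reached with a well-known shortcut: the mean of the squares minus the square of the mean. The sketch below applies it to this dataset. It is an alternative to the step-by-step method shown above, and with floating-point numbers it can lose precision when the values are large compared with their spread.

```python
data = [3, 5, 7, 9, 11]
n = len(data)

mean = sum(data) / n                              # 7.0
mean_of_squares = sum(x * x for x in data) / n    # 285 / 5 = 57.0

pop_var = mean_of_squares - mean ** 2             # 57.0 - 49.0 = 8.0
samp_var = pop_var * n / (n - 1)                  # 8.0 * 5 / 4 = 10.0

print(pop_var, samp_var)                          # 8.0 10.0
```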
Ans: Mean = (2 + 4 + 5 + 6 + 8 + 17) / 6 = 42 / 6 = 7
Deviations from the mean:
2 − 7 = -5
4 − 7 = -3
5 − 7 = -2
6 − 7 = -1
8 − 7 = 1
17 − 7 = 10
Squared deviations:
(-5)² = 25
(-3)² = 9
(-2)² = 4
(-1)² = 1
1² = 1
10² = 100
Sum = 25 + 9 + 4 + 1 + 1 + 100 = 140
Population variance:
σ² = 140 / 6 ≈ 23.33
Sample variance:
s² = 140 / (6 - 1) = 140 / 5 = 28
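Most of this variance comes from the single value 17, which contributes 100 of the 140 in the sum of squared deviations. As a rough illustration of how sensitive variance is to outliers, the sketch below compares the result with and without that value, using the standard statistics module.

```python
import statistics

data = [2, 4, 5, 6, 8, 17]
without_outlier = data[:-1]          # [2, 4, 5, 6, 8]

with_var = statistics.variance(data)                 # 28
without_var = statistics.variance(without_outlier)   # 5

print(with_var, without_var)
```

Dropping one value out of six reduces the sample variance from 28 to 5.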
Strengthen your understanding of variance, explore real-life examples, and make learning maths enjoyable with Orchids The International School!