In this article, we will learn how to find outliers in Excel.
What are Outliers?
Outliers are the values in the data which are outside the scope of the general data values, meaning that they are very much higher than very much lower than the general data values.
Why do we need to remove the Outliers?
Outliers spoil our data and representation. When we create a graph of complete data, if some values are extremely high and extremely low, the other values/range of the graph do not look good.
Let us create simple data. In this example, we can clearly see that some values are really outside, e.g., 200.
Let’s discuss the procedure to remove outliers. Let’s start by finding the first and third quartile:
1st quartile: =QUARTILE(A2:A14,1)
3rd quartile: =QUARTILE(A2:A14,3)
How to calculate interquartile range?
Let’s calculate the interquartile range (IQR). Because it is a range, you just need to subtract values. Q3-Q1.
Calculating IQR gives us the possibility to calculate the lower and upper bounds of data.
Let us calculate the lower bound and upper bound values:
Lower bound = lower of Q1 or Q3 – 1.5 * IQR
Upper bound = max of Q1 or Q3 + 1.5 * IQR
Let us find the outlier by using the U bound and L bound: =OR(A2<$F$2,A2>$G$2)
Anything lower than the lower bound and higher than the upper bound is an outlier.
In Excel, to mark outliers, you can use a conditional formula like: =OR(value < (Q1 – 1.5 * IQR), value > (Q3 + 1.5 * IQR))