Author avatar

Chhaya Wagmi

Working with Statistics Function in Excel

Chhaya Wagmi

  • Aug 12, 2019
  • 10 Min read
  • 6 Views
  • Aug 12, 2019
  • 10 Min read
  • 6 Views
Business Professional
Microsoft Excel

Introduction

In this guide, you will learn about various statistical functions that are available in Excel 2019. By the end of this guide, you'll be familiar with some important statistical functions with examples. Later, you can explore the rest of the functions using the Excel provided documentation.

The functions which we are going to learn in this guide are mentioned below:

  • AVERAGE
  • AVERAGEIF
  • MEDIAN
  • PERMUT
  • COUNT
  • COUNTIF
  • FREQUENCY
  • STDEV, and VAR
  • PERCENTILE.INC, QUARTILE.INC, and RANK.EQ

Statistical Functions

In this section, we will list out the above stated functions:

The AVERAGE Function

Let us start with the most commonly used function, the AVERAGE. It gives us the average of n numbers by summing them up and dividing by the sum of the total numbers. Many beginners believe that they can only get the average of two numbers but this is not the case. Here's the syntax for the function:

1
=AVERAGE(n1, n2, n3 ... nn)

To elaborate this function with an example, let’s consider the given tabular data which consists of three students' marks data. You need to find the average marks:

ABCD
Subject 1Subject 2Subject 3Result
263878?
657811?
128850?

To find the average marks for each student, you need to deploy the given formula in D1 and then stretch it all the way down for rest of the two, =AVERAGE(A1, B1, C1). This results in the following answer:

ABCD
Subject 1Subject 2Subject 3Result
26387847.33
65781151.33
12885050

The AVERAGEIF Function

Now, in the above example, consider that we only needed to average those subject marks where the student has scored more than 50. Such a result can't be calculated using the AVERAGE function, therefore, we have a need for a new function, the AVERAGEIF function. The syntax is given below:

1
=AVERAGEIF(range, criteria, [average_range])

Now, let us form a tabular data of a student whose marks, in four out of five subjects, are known. We need to find the average marks, considering only those subjects where she has scored more than 50.

AB
SubjectsMarks
Subject 126
Subject 265
Subject 312
Subject 4
Subject 578

Now, in any new cell, you can write the given formula to arrive at the result:

1
=AVERAGEIF(B1:B5, ">50")

This gives us the result as 71.5 by avoiding the marks for the subjects 1, 3, and 4.

The MEDIAN Function

At the back-end, the AVERAGE function uses the mean concept in calculation. This has a disadvantage if their is an anomaly in the data. For example, consider a case where three doctors report their number of patients analyzed in five consecutive days, as shown:

ABCDE
Day 1Day 2Day 3Day 4Day 5
812161087
121510119
84102

If we try to average out the number of patients then the anomaly (87) for the first doctor can bring skewness in the result. Hence, in such scenarios, we prefer to use median; the syntax in Excel is shown below:

1
=MEDIAN(n1, n2, n3 ... nn)

To apply the median on the table, you can proceed as =MEDIAN(A1, B1, C1, D1, E1) which results in the answer as 12. If we compare this answer with the =AVERAGE(A1, B1, C1, D1, E1) answer, 26.6 = 27 makes more sense.

The PERMUT Function

Permutation is a widely used concept in the realm of statistics. You can perform the permutation on a given data using the given syntax in Excel:

1
=PERMUT(n, n_chosen)

To understand what permutation is and how you can implement it in Excel, consider a case of three athletes of which only two need to be selected for a marathon. Permutation helps in answering this question without counting all the possibilities by hand. If we represent this case in a tabular format, it looks like this:

AB
Number of athletesTo be chosen
32

You can implement the PERMUT function as shown:

1
=PERMUT(A1, B1)

This gives the answer of 6, stating that there are six possibilities for choosing two athletes from a group of three.

The COUNTBLANK Function

The COUNTBLANK function returns the number of blank cells in a given specified range of cells. This can be a very useful function when you need to know how many values are missing in data.

The syntax for the COUNTBLANK function is given below:

1
=COUNTBLANK(range_of_cells)

To implement it in Excel, consider the given tabular data:

A
Values
9
5
8
7

So, if you implement =COUNTBLANK(A1:A7), it counts the number of blank values. The result of the formula is 3. Three represents the total of the one blank value between 9 and 5 along with two blank values between 5 and 8.

The COUNTIF Function

Let us assume you need to calculate the number of people whose marks are above a cut-off decided at value 88. To do this, we can use the COUNTIF function whose syntax is given below:

1
=COUNTIF(range_of_cells)

To illustrate the scenario, consider the given tabular data:

A
Marks
50
90
23
65
98
55

On the given tabular data, implement the following formula:

1
=COUNTIF(A1:A7, ">88")

This results in the value 2 for the values 90 and 98. Note that the function has also ignored the blank cell.

The FREQUENCY Function

Consider a column with a large number of values ranging from 1-1000 and you need to count how many values fall in the bins 0-270, 271-350, 351-888, and 888-1000. If you have a small number of values available then this can be done manually. However, for a large set of numbers, it is suggested to use the FREQUENCY function. Its syntax is given below:

1
=FREQUENCY(value_range, bin_range)

Now, let us consider a tabular data to implement the function:

AB
ValuesBins
255270
615350
478888
241000
11
998
548
250
341
880
1222
650
555

To implement the FREQUENCY function, you first need to learn the array function. The given FREQUENCY function doesn't result in only one item, rather it results in one plus the number of bin values. In our table, we have four values available in Bins, so the output of the FREQUENCY function will have five values which form an array. Therefore, first, you need to select five vertical cells and write the given formula. The formula should be written in the first cell:

1
=FREQUENCY(A1:A13, B1:B4)

Once you write the formula, press Ctrl + Shift + Enter. This will result in the following answer and will also write the above formula within the curly braces, {}, of all the cells .

ABC
ValuesBinsFrequency
2552704
6153501
4788886
2410001
111
998
548
250
341
880
1222
650
555

You can refer to this short YouTube video to learn how to use the array function.

The STDEV and VAR Functions

So far, we have learned about the functions that either belong to the measure of central tendency or are commonly used. Now, let us also explore to use two statistical functions that belong to the dispersion class. We are talking about the variance and standard deviation of a sample collected from a population data.

Their syntax is presented below:

1
2
=STDEV(data_array)
=VAR(data_array)

Let us consider a sample of five values taken from a population data as shown:

A
50
98
12
233
1

To find the variance and standard deviation in this data, we can implement the functions as shown:

1
2
=VAR(A1:A5)
=STDEV(A1:A5)

This results in the values as Variance = 8872.7 and Standard Deviation = 94.19.

The PERCENTILE.INC, QUARTILE.INC and RANK.EQ Functions

The PERCENTILE.INC function returns the K'th percentile of values in a provided range, where K lies in the range 0 - 1 (both inclusive).

The QUARTILE.INC function returns the specified quartile of a set range of provided numbers, based on percentile value 0 - 1 (both inclusive). Here's the list of available percentiles that we can attain through the QUARTILE.INC function:

ArgumentResult
0Minimum value
1First quartile (25th percentile)
2Second quartile (50th percentile). Also, the median.
3Third quartile (75th percentile)
4Maximum quartile

The RANK.EQ function returns the mode of a list of provided numbers. FYI, mode means the number with the most frequency. Also, if more than one value has the same rank, the top rank of that set is returned.

Here're the syntax of these functions:

1
2
3
=PERCENTILE.INC(array, k)
=QUARTILE.INC(array, quart)
=RANK.EQ(number_for_which_you_want_to_find_rank, ref, [order])

Let us consider the below tabular data to implement these functions:

A
5
5
8
9
7
1
11
4
2
5

To find the 80th percentile of the data, use the following:

1
=PERCENTILE.INC(A1:A10, 0.8)

This gives the value as 8.2.

To find the 50th percentile using the QUARTILE.INC function, pass the value 2 to the quart argument:

1
=QUARTILE.INC(A1:A10, 2)

That gives the value as 5.

To find the rank of number 7, proceed with the following formula:

1
=RANK.EQ(A5, A1:A10)

That gives us the rank of 7 as 4.

Conclusion

In this guide, you've learned various important statistical functions that are commonly used in Excel. So, go ahead and start exploring them in your own scenarios.

0