 Chhaya Wagmi

# Working with Statistics Function in Excel

• Sep 14, 2020
• 2,160 Views
• Sep 14, 2020
• 2,160 Views
Microsoft Excel

## Introduction

In this guide, you will learn about various statistical functions that are available in Excel 2019. By the end of this guide, you'll be familiar with some important statistical functions with examples. Later, you can explore the rest of the functions using the Excel provided documentation.

The functions which you are going to learn in this guide are:

• AVERAGE
• AVERAGEIF
• MEDIAN
• PERMUT
• COUNT
• COUNTIF
• FREQUENCY
• STDEV, and VAR
• PERCENTILE.INC, QUARTILE.INC, and RANK.EQ

## The AVERAGE Function

To start with, you have the most commonly used function, the AVERAGE. It gives you an average of `n` numbers by summing them up and dividing by the sum of the total numbers. Many beginners believe that they can only get the average of two numbers but this is not the case. Here's the syntax for the function:

``1=AVERAGE(n1, n2, n3 ... nn)``

To elaborate this function with an example, consider the given tabular data which consists of three students' marks data. You need to find the average marks:

ABCD
Subject 1Subject 2Subject 3Result
263878?
657811?
128850?

To find the average marks for each student, you need to deploy the given formula in D1 and then stretch it all the way down for rest of the two, `=AVERAGE(A1, B1, C1)`. This results in the following answer:

ABCD
Subject 1Subject 2Subject 3Result
26387847.33
65781151.33
12885050

## The AVERAGEIF Function

Now, in the above example, consider that you only need to average those subject marks where the student has scored more than 50. Such a result can't be calculated using the `AVERAGE` function, therefore, you need a new function, the `AVERAGEIF` function. The syntax is given below:

``1=AVERAGEIF(cell_range, condition, [opt_avg_range])``

Form a tabular data for a student whose marks, in four out of five subjects, are known. You need to find the average marks, considering only those subjects where she has scored more than 50.

AB
SubjectsMarks
Subject 126
Subject 265
Subject 312
Subject 4
Subject 578

Now, in any new cell, you can write the given formula to arrive at the result:

``1=AVERAGEIF(B1:B5, ">50")``

This gives you the result as 71.5 by avoiding the marks for the subjects 1, 3, and 4.

## The MEDIAN Function

At the back-end, the `AVERAGE` function uses the `mean` concept in calculation. This has a disadvantage if their is an anomaly in the data. For example, consider a case where three doctors report their number of patients analyzed in five consecutive days, as shown:

ABCDE
Day 1Day 2Day 3Day 4Day 5
812161087
121510119
84102

If you try to average out the number of patients then the anomaly (87) for the first doctor can bring skewness in the result. Hence, in such scenarios, you should prefer to use `median`; the syntax in Excel is shown below:

``1=MEDIAN(n1, n2, n3 ... nn)``

To apply the median on the table, you can proceed as `=MEDIAN(A1, B1, C1, D1, E1)` which results in the answer as 12. If you compare this answer with the `=AVERAGE(A1, B1, C1, D1, E1)` answer, 26.6 = 27 makes more sense.

## The PERMUT Function

Permutation is a widely used concept in the realm of statistics. You can perform the permutation on a given data using the given syntax in Excel:

``1=PERMUT(n, n_chosen)``

To understand what permutation is and how you can implement it in Excel, consider a case of three athletes of which only two need to be selected for a marathon. Permutation helps in answering this question without counting all the possibilities by hand. If you represent this case in a tabular format, it will look like this:

AB
Number of athletesTo be chosen
32

You can implement the `PERMUT` function as shown:

``1=PERMUT(A1, B1)``

This gives the answer 6, stating that there are six possibilities for choosing two athletes from a group of three.

## The COUNTBLANK Function

The COUNTBLANK function returns "the number of blank cells in a given specified range of cells," according to Excel's documentation. This can be a very useful function when you need to know how many values are missing in data.

The syntax for the COUNTBLANK function is given below:

``1=COUNTBLANK(range_of_cells)``

To implement it in Excel, consider the given tabular data:

A
Values
9
5
8
7

So, if you implement `=COUNTBLANK(A1:A7)`, it counts the number of blank values. The result of the formula is 3. Three represents the total of the one blank value between 9 and 5 along with two blank values between 5 and 8.

## The COUNTIF Function

Assume you need to calculate the number of people whose marks are above a cut-off decided at value 88. To do this, you can use the `COUNTIF` function whose syntax is given below:

``1=COUNTIF(range_of_cells)``

To illustrate the scenario, consider the given tabular data:

A
Marks
50
90
23
65
98
55

On the given tabular data, implement the following formula:

``1=COUNTIF(A1:A7, ">88")``

This results in the value 2 for the values 90 and 98. Note that the function has also ignored the blank cell.

## The FREQUENCY Function

Consider a column with a large number of values ranging from 1-1000 and you need to count how many values fall in the bins 0-270, 271-350, 351-888, and 888-1000. If you have a small number of values available then this can be done manually. However, for a large set of numbers, it is suggested to use the FREQUENCY function. Its syntax is given below:

``1=FREQUENCY(value_range, bin_range)``

Consider this tabular data to implement the function:

AB
ValuesBins
255270
615350
478888
241000
11
998
548
250
341
880
1222
650
555

To implement the `FREQUENCY` function, you first need to learn the array function. The given `FREQUENCY` function doesn't result in only one item, rather it results in one plus the number of bin values. In our table, we have four values available in Bins, so the output of the `FREQUENCY` function will have five values which form an array. Therefore, first, you need to select five vertical cells and write the given formula. The formula should be written in the first cell:

``1=FREQUENCY(A1:A13, B1:B4)``

Once you write the formula, press `Ctrl + Shift + Enter`. This will result in the following answer and will also write the above formula within the curly braces, `{}`, of all the cells .

ABC
ValuesBinsFrequency
2552704
6153501
4788886
2410001
111
998
548
250
341
880
1222
650
555

You can refer to this short YouTube video to learn how to use the array function.

## The STDEV and VAR Functions

So far, you have learned about the functions that either belong to the measure of central tendency or are commonly used. In this section, you will explore two statistical functions that belong to the dispersion class i.e., the variance and standard deviation of a sample collected from a population data.

Their respective syntax is presented below:

``````1=STDEV(data_array)
2=VAR(data_array)``````

Consider a sample of five values taken from a population data as shown:

A
50
98
12
233
1

To find the variance and standard deviation in this data, you can implement the functions as shown:

``````1=VAR(A1:A5)
2=STDEV(A1:A5)``````

This results in the values are Variance = 8872.7 and Standard Deviation = 94.19.

## The PERCENTILE.INC, QUARTILE.INC and RANK.EQ Functions

The `PERCENTILE.INC` function returns "the K'th percentile of values in a provided range, where K lies in the range 0 - 1 (both inclusive)," according to Excel's documentation.

The `QUARTILE.INC` function returns "the specified quartile of a set range of provided numbers, based on percentile value 0 - 1 (both inclusive)," according to Excel's documentation.

Here's the list of available percentiles that you can attain through the `QUARTILE.INC` function:

ArgumentResult
0Minimum value
1First quartile (25th percentile)
2Second quartile (50th percentile). Also, the median.
3Third quartile (75th percentile)
4Maximum quartile

The `RANK.EQ` function returns the mode of a list of provided numbers. FYI, mode means the number with the most frequency. Also, if more than one value has the same rank, the top rank of that set is returned.

Here're the syntax of these functions:

``````1=PERCENTILE.INC(data, percentile)
2=QUARTILE.INC(data, quartile)
3=RANK.EQ(number_whose_rank_is_to_be_calculated, reference, [order])``````

Consider the below tabular data to implement these functions:

A
5
5
8
9
7
1
11
4
2
5

To find the 80th percentile of the data, use the following:

``1=PERCENTILE.INC(A1:A10, 0.8)``

This gives the value as 8.2.

To find the 50th percentile using the `QUARTILE.INC` function, pass the value 2 to the `quartile` argument:

``1=QUARTILE.INC(A1:A10, 2)``

That gives the value as 5.

To find the rank of number 7, proceed with the following formula:

``1=RANK.EQ(A5, A1:A10)``

That gives you the rank of number 7 as 4.

## Conclusion

In this guide, you've learned various important statistical functions that are commonly used in Excel. So, go ahead and start exploring them in your own scenarios.