Author avatar

Deepika Singh

Build Scatter Chart in Power BI

Deepika Singh

  • Nov 24, 2020
  • 7 Min read
  • 1,327 Views
  • Nov 24, 2020
  • 7 Min read
  • 1,327 Views
Microsoft Power BI
Data
Business Intelligence
Data Analytics

Introduction

A scatter plot is a very useful chart to visualize the relationship between two numerical variables. It is used in inferential statistics to visually examine correlation between two variables. This guide will demonstrate how to build a scatter plot, format it, and add dimensions to the chart with the analytics pane of Power BI Desktop.

Data

In this guide, you will work with a fictitious data set of bank loan disbursal across years. The data contains 3,000 observations and 17 variables. You can download the dataset here. The major variables are described below:

  1. Date: Loan disbursal date.
  2. Loan_disbursed: Loan amount (in US dollars) disbursed by the bank.
  3. Age: The applicant’s age in years.
  4. Gender: Whether the applicant is female (F) or male (M).
  5. Purpose: Purpose for which loan was taken.
  6. Weeknum: Week number of the year.

Start by loading the data.

Loading Data

Once you open your Power BI Desktop, the following output is displayed.

b1

Click on Get data option and select Excel from the options.

b2

Browse to the location of the file and select it. The name of the file is BIdata.xlsx, and the sheet you will load is BIData sheet. The preview of the data is shown, and once you are satisfied that you are loading the right file, click Load.

b3

You have loaded the file, and you can save the dashboard. It is named PowerBI Visualization. The Fields pane contains the variables of the data.

b4

Creating Scatter Plot

To begin, click on the Scatter chart option located in the Visualizations pane. This creates a chart box in the canvas. Nothing is displayed because you are yet to add the required visualization arguments.

sp1

You can resize the chart on the canvas. The next step is to fill the visualization arguments under the Fields option as shown below. Drag the variable Age into the X Axis field, and Loan_disbursed in the Y Axis field. Next, fill the Details field with the Weeknum variable.

sp2

The output above shows the scatterplot of Age and Loan_disbursed by Weeknum. But there is a mistake in the chart. The variables Age and Loan_disbursed are getting added, instead of being an average. To make this correction, click on X Axis field and select Average.

sp4

Repeat the same for Loan_disbursed variable placed in the Y Axis.

sp5

This will create the scatterplot. There are options to format the plot under the Format tab.

sp6

If you want to change the title formatting, you can go to the Format tab, and change the Alignment. You can also set the Text size to 16.

sp7

You have the desired scatter plot where each point represents the average age of the applicant, and the average loan disbursed, for a week number.

SP8

Adding Analytics to the Scatter Plot

Power BI also provides the option to add analytics to the scatter chart with the Analytics pane.

SP9

To begin, you can add Trend line to the chart. Click on Add.

SP10

Select the Color, Transparency level, and Style options as shown in the chart below, or as per your preference. This will create the following output.

SP11

If you look at the scatter plot, you can examine that most of the applicants are of the average age between 44 and 53 years. If you want to add a fixed line around X axis, you can select the X Axis Constant Line option.

SP12

Click on Add and set Value to 44. Also, keep the Color and Transparency options as shown below.

SP13

You can repeat the above process to add the X Axis Constant Line 2 and set the Value to 53.

SP14

The output above shows the scatter plot enriched with the analytics of a trend line and two fixed X axis lines.

Conclusion

Scatter plot is an important visualization chart in business intelligence and analytics. It is used in inferential statistics to visually examine the extent of linear relationship between two numerical variables. It is also used to identify and treat outliers which is a data pre-processing element in data science. In business intelligence, you will be required to build these charts for a variety of areas, such as market research, causal inference, business statistics, machine learning, exploratory data analysis, time series analysis and many more. This is a great skill set to add because you will be using it regularly across sectors and organizational functions.

To learn more about building powerful visualization in Power BI Desktop, please refer to the following guides:

7