Gaurav Singhal

Building Visualizations with Pygal

  • Feb 20, 2020
  • 12 Min read
  • 742 Views
Data
Pygal

Introduction

Pygal is a Python library for building scalable vector graphics (SVG) charts in a variety of styles. In this guide, we will learn how to build dynamic, interactive charts from datasets using a combination of the Pandas and Pygal libraries. We'll start with the basics of Pygal, such as how different chart types take their input data. Then we'll integrate Pygal with Pandas.

For more information about Pygal, read my two previous guides:

Installation of Pygal

There are no dependencies for installing Pygal. It's available for Python 2.7+. For this guide, I am assuming that you have Python and pip installed on your system.

  • pip install: To install using pip, open the terminal and run the following command:
pip install pygal
shell
  • conda Install: To install using conda, open the terminal and run the following command:
conda install -c conda-forge pygal
shell

Pygal Basics

We will go through multiple examples to get an idea of what Pygal expects as input data. The core idea is to let Pandas create the data in a format that Pygal's visualizations can consume easily.

I assume you have basic knowledge of Python, have Python 3 installed on your system, and have a web browser to view the rendered SVG files.

Horizontal Bar Chart

For a basic horizontal bar chart, we create a pygal.HorizontalBar() object.

# First import pygal
import pygal
# Then create a horizontal bar chart object
bar_chart = pygal.HorizontalBar()
bar_chart.title = 'Random data'
# Each call to add() creates one series with a single value
bar_chart.add('A', 20.5)
bar_chart.add('B', 36.7)
bar_chart.add('C', 6.3)
bar_chart.add('D', 4.5)
bar_chart.add('E', 80.3)
# Render the chart to an SVG file
bar_chart.render_to_file('Horizontal bar chart.svg')
python

Dot Chart

For the basic dot chart, we have to call the pygal.Dot() object.

In this dot chart, you can see that for each series ("A", "B", "C", or "D"), we need to call the add() function and provide a list of values.

# importing pygal
import pygal
# creating a dot chart object
dot_chart = pygal.Dot(x_label_rotation=30)
# setting the title
dot_chart.title = 'A vs B vs C vs D'
# naming the labels on the x axis
dot_chart.x_labels = ['Richards', 'DeltaBlue', 'Crypto', 'RayTrace', 'EarleyBoyer', 'RegExp', 'Splay', 'NavierStokes']
# adding the data, one list of values per series
dot_chart.add('A', [6395, 2212, 7520, 7218, 12464, 1660, 2123, 8607])
dot_chart.add('B', [8473, 9099, 1100, 2651, 6361, 1044, 3797, 9450])
dot_chart.add('C', [3472, 2933, 4503, 5229, 5510, 1828, 9013, 4669])
dot_chart.add('D', [43, 41, 59, 79, 144, 136, 34, 102])
# rendering the file
dot_chart.render_to_file("dot chart.svg")
python

These two examples show that different Pygal chart types expect their data in different shapes: a single value per series for the horizontal bar chart, and a list of values per series for the dot chart.

Data

We saw in the above examples how Pygal takes data in a specific format. Now we will take a public dataset in CSV format and build Pygal visualizations with the help of Pandas.

I assume you have basic knowledge of the Pandas library and have installed it on your system.

Here I have a dataset of mobile operating system market share worldwide in 2019. You can download it here.

We will first load the data using Pandas, as usual.

import pandas as pd
data_frame = pd.read_csv("Mobile Operating System Market Share Worldwide.csv")
data_frame.head()
python
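If you do not have the CSV file at hand, the sketch below loads a tiny stand-in from a string so you can follow along. The column names match the ones used later in this guide, but the values and the "2019-01" date format are made up for illustration, and sample_frame is a hypothetical name.

```python
import io
import pandas as pd

# Made-up sample rows; only the column layout mirrors the guide's CSV.
sample_csv = io.StringIO(
    "Date,Android,iOS,KaiOS,Samsung\n"
    "2019-01,74.0,23.0,0.8,0.3\n"
    "2019-02,75.0,22.0,0.7,0.3\n"
    "2019-03,76.0,21.0,0.6,0.2\n"
)
sample_frame = pd.read_csv(sample_csv)

# Inspect the first rows and the inferred column types.
print(sample_frame.head())
print(sample_frame.dtypes)
```

Checking dtypes right after loading shows whether the numeric columns were read as floats or whether they need typecasting.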

Now we will manipulate or arrange this data according to the type of chart we want to plot in Pygal.

Charts with Pandas and Pygal

Now we will build the same kinds of charts from the public dataset, using Pandas to prepare the data and Pygal to draw it.

Line Chart With CSV Data

In this line chart, we will compare the four most popular operating systems: Android, iOS, KaiOS, and Samsung.

Before starting, we have to make sure the columns we want to visualize have the right data types.

Pandas lets us typecast those columns while reading the file, through the dtype argument of read_csv().

# Importing pandas library.
import pandas as pd
# Reading the CSV file and typecasting its columns.
data_frame = pd.read_csv("Mobile Operating System Market Share Worldwide.csv",
                         dtype={
                             "Date": str,
                             "Android": float,
                             "iOS": float,
                             "KaiOS": float,
                             "Samsung": float
                         })
python

After typecasting, we are ready to use our dataset for visualizations.

# importing pygal library
import pygal
# we will append data to these lists
a = []
b = []
c = []
d = []
# creating object
line_chart = pygal.Line()
# naming the title
line_chart.title = 'Android, iOS, KaiOS and Samsung usage in 2019 months wise'
# adding month numbers 1 to 12 as labels (range() excludes its end value)
line_chart.x_labels = [str(month) for month in range(1, 13)]
for index, row in data_frame.iterrows():
    a.append(row["Android"])
    b.append(row["iOS"])
    c.append(row["KaiOS"])
    d.append(row["Samsung"])
# adding each list as one series
line_chart.add('Android', a)
line_chart.add('iOS', b)
line_chart.add('KaiOS', c)
line_chart.add('Samsung', d)
# rendering the file
line_chart.render_to_file("line_chart.svg")
python

Stacked Line Chart

No further data manipulation is needed for a stacked line chart; we only swap pygal.Line() for pygal.StackedLine(fill=True).

# importing pygal library
import pygal
# we will append data to these lists
a = []
b = []
c = []
d = []
# creating object
line_chart = pygal.StackedLine(fill=True)
# naming the title
line_chart.title = 'Android, iOS, KaiOS and Samsung usage in 2019 months wise'
# adding month numbers 1 to 12 as labels (range() excludes its end value)
line_chart.x_labels = [str(month) for month in range(1, 13)]
for index, row in data_frame.iterrows():
    a.append(row["Android"])
    b.append(row["iOS"])
    c.append(row["KaiOS"])
    d.append(row["Samsung"])
# adding each list as one series
line_chart.add('Android', a)
line_chart.add('iOS', b)
line_chart.add('KaiOS', c)
line_chart.add('Samsung', d)
# rendering the file
line_chart.render_to_file("Stacked line chart.svg")
python

Basic Bar Graph

We can also plot a basic bar graph using the same approach.

# importing pygal library
import pygal
# we will append data to these lists
a = []
b = []
c = []
d = []
# creating object
bar_chart = pygal.Bar()
# naming the title
bar_chart.title = 'Android, iOS, KaiOS and Samsung usage in 2019 months wise'
# adding month numbers 1 to 12 as labels (range() excludes its end value)
bar_chart.x_labels = [str(month) for month in range(1, 13)]
for index, row in data_frame.iterrows():
    a.append(row["Android"])
    b.append(row["iOS"])
    c.append(row["KaiOS"])
    d.append(row["Samsung"])
# adding the appended lists, one series each
bar_chart.add('Android', a)
bar_chart.add('iOS', b)
bar_chart.add('KaiOS', c)
bar_chart.add('Samsung', d)
# rendering the file
bar_chart.render_to_file("Basic Bar chart.svg")
python

Dot Chart

For this chart, we have to add labels to identify our data for each month.

# importing pygal library
import pygal
# we will append data to these lists
a = []
b = []
c = []
d = []
# creating dot chart object
dot_chart = pygal.Dot(x_label_rotation=30)
# naming the title
dot_chart.title = 'Android, iOS, KaiOS and Samsung usage in 2019 months wise'
# one label per month, January through December
dot_chart.x_labels = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
for index, row in data_frame.iterrows():
    a.append(row["Android"])
    b.append(row["iOS"])
    c.append(row["KaiOS"])
    d.append(row["Samsung"])
# adding the appended lists, one series each
dot_chart.add('Android', a)
dot_chart.add('iOS', b)
dot_chart.add('KaiOS', c)
dot_chart.add('Samsung', d)
# rendering the file
dot_chart.render_to_file("dots chart.svg")
python
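Typing the twelve month names by hand makes it easy to miss one. As a small sketch, the standard library's calendar module can generate them instead; month_labels is a hypothetical name, and the names follow the current locale (English by default).

```python
import calendar

# calendar.month_name[0] is an empty string, so start the slice at index 1.
month_labels = calendar.month_name[1:]

print(len(month_labels))  # number of labels generated
print(month_labels)
```

The resulting list could then be assigned to dot_chart.x_labels in place of the hand-written list.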

Horizontal Bar Chart

As we saw in the basic horizontal bar chart example above, it takes its data in a different shape than the line chart: one value per series rather than a list.

First, we will load the CSV file with Pandas again and transpose the dataset using the data_frame.T attribute.

# importing pandas 
import pandas as pd
# Reading the Data Frame  
data_frame = pd.read_csv("Mobile Operating System Market Share Worldwide.csv")
# we will transpose the data set
data_frame = data_frame.T
data_frame.head()
python

Now we will drop the Date row using the drop() function of Pandas. Note that drop() returns a new DataFrame, so we assign the result back.

data_frame = data_frame.drop(['Date'])
python

The first column, which holds the operating system names, is the index and has no name. To turn it into a regular column, we will save the data and read the saved file again.

data_frame.to_csv(r'Os market share world wide.csv')
data_frame = pd.read_csv("Os market share world wide.csv")
data_frame
python

As you can see, after saving and rereading it, Pandas has automatically indexed rows as well.

Now we will name the unnamed column.

data_frame.rename( columns={'Unnamed: 0':'new column'}, inplace=True )
python

So now, we are ready to plot our dataset with Pygal. The image below shows us our final dataset.

The column names 0 to 11 correspond to the months from January 2019 to December 2019.

Before going to the visualization of the month of our choice, we need to typecast that column using the astype() function of Pandas.

data_frame = data_frame.astype({"new column":'str', "0":'float64'})
python
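The same reshaping can also be done in memory, without saving and rereading a file. The sketch below is one possible alternative on made-up sample data; the name wide is hypothetical. Note one difference from the steps above: here the month columns keep integer labels (0, 1, ...) rather than the string labels ("0", "1", ...) that appear after a CSV round trip.

```python
import io
import pandas as pd

# Made-up sample rows; only the column layout mirrors the guide's CSV.
csv_text = (
    "Date,Android,iOS,KaiOS,Samsung\n"
    "2019-01,74.5,22.9,0.8,0.3\n"
    "2019-02,74.2,23.3,0.7,0.3\n"
)
data_frame = pd.read_csv(io.StringIO(csv_text))

wide = (
    data_frame.drop(columns="Date")   # remove the dates before transposing
    .T                                # operating systems become rows
    .reset_index()                    # move the unnamed index into a column called "index"
    .rename(columns={"index": "new column"})
)
print(wide)
```

Because the Date column is dropped before transposing, the remaining values stay numeric, so no extra typecasting is needed in this variant.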

Now we will plot our dataset using Pygal.

# importing the pygal library
import pygal
# creating horizontal bar chart object
bar_chart = pygal.HorizontalBar()
# naming the title
bar_chart.title = 'usage in month of january'
# adding the data from our dataset by iterating over its rows with iterrows()
for index, row in data_frame.iterrows():
    bar_chart.add(row["new column"], row["0"])
# rendering the file
bar_chart.render_to_file("horizontal bar chart.svg")
python

Pie Chart

We will build the pie chart for July. To do that, we have to typecast the data for July, which is stored in column 6.

data_frame = data_frame.astype({"6": 'float64'})
python

Now we are ready to plot the pie chart.

# importing the pygal library
import pygal
# creating pie chart object
pie_chart = pygal.Pie()
# naming the title
pie_chart.title = 'usage in month of july'
# adding the data from our dataset by iterating over its rows with iterrows()
for index, row in data_frame.iterrows():
    pie_chart.add(row["new column"], row["6"])
# rendering the file
pie_chart.render_to_file("pie chart.svg")
python

Conclusion

In this guide, we used the Pygal and Pandas libraries to plot dynamic, interactive visualizations with several chart types. We saw how easy it is to integrate these two libraries to plot charts from a CSV dataset. I found this combination easy and powerful. I hope that after reading this guide, you have learned some simple tricks for integrating these two powerful Python libraries.

If you have any queries related to this guide, feel free to ask at Codealphabet.
