Pygal is a Python library that enables us to build scalable vector graphics (SVG) graphs and charts in a variety of styles. In this guide, we will learn how to build dynamic and interactive charts from datasets using a combination of the Pandas and Pygal libraries. We'll start with the basics of Pygal, such as how to use its chart objects to visualize data and what input each type of visualization expects. Then we'll integrate Pygal with Pandas.
For more information about Pygal, read my two previous guides:
There are no dependencies for installing Pygal. It's available for Python 2.7+. For this guide, I am assuming that you have Python and pip installed on your system.
To install with pip, open the terminal and run the following command:
pip install pygal
To install with conda, open the terminal and run the following command:
conda install -c conda-forge pygal
We will go through multiple examples to get an idea of what Pygal expects as input data. The core idea is to let Pandas create the data in a format that Pygal's visualizations can consume easily.
I assume you have basic knowledge of Python, have installed a Python 3+ version on your system, and have a web browser.
For the basic horizontal bar graph, we have to create a pygal.HorizontalBar() object.
# First import pygal
import pygal
# Then create a horizontal bar graph object
bar_chart = pygal.HorizontalBar()
bar_chart.title = 'Random data'
# Each call to add() contributes one bar with a single value
bar_chart.add('A', 20.5)
bar_chart.add('B', 36.7)
bar_chart.add('C', 6.3)
bar_chart.add('D', 4.5)
bar_chart.add('E', 80.3)
# rendering the file
bar_chart.render_to_file('Horizontal bar chart.svg')
For the basic dot chart, we have to create a pygal.Dot() object.
In this dot chart, you can see that for each category ("A", "B", "C", or "D"), we need to call the add function and provide the data as a list.
# importing pygal
import pygal
# creating an object
dot_chart = pygal.Dot(x_label_rotation=30)
# naming the title
dot_chart.title = 'A vs B vs C vs D'
# naming the different labels
dot_chart.x_labels = ['Richards', 'DeltaBlue', 'Crypto', 'RayTrace', 'EarleyBoyer', 'RegExp', 'Splay', 'NavierStokes']
# adding the data
dot_chart.add('A', [6395, 2212, 7520, 7218, 12464, 1660, 2123, 8607])
dot_chart.add('B', [8473, 9099, 1100, 2651, 6361, 1044, 3797, 9450])
dot_chart.add('C', [3472, 2933, 4503, 5229, 5510, 1828, 9013, 4669])
dot_chart.add('D', [43, 41, 59, 79, 144, 136, 34, 102])
# rendering the file
dot_chart.render_to_file("dot chart.svg")
I think now you have enough understanding of how different types of Pygal visualizations take data differently.
We saw in the examples above that Pygal takes data in a specific format. Now we will take a public dataset in CSV format and build Pygal visualizations from it with the help of Pandas.
I assume you have basic knowledge of the Pandas library and have installed it on your system.
Here I have a dataset of mobile operating system market share worldwide in 2019. You can download it here.
We will first load the data using Pandas, as usual.
import pandas as pd
data_frame = pd.read_csv("Mobile Operating System Market Share Worldwide.csv")
data_frame.head()
Now we will manipulate or arrange this data according to the type of chart we want to plot in Pygal.
Now we will rebuild the charts from the examples above, this time using the public dataset that we loaded with Pandas.
In this line chart, we will compare the four most popular operating systems: Android, iOS, KaiOS, and Samsung.
Before starting, we have to set the data type of the columns we want to use for visualization.
In Pandas, we can typecast those columns while reading the data by passing the dtype argument to read_csv().
# Importing pandas library.
import pandas as pd
# Reading and type casting columns in our csv file.
data_frame = pd.read_csv("Mobile Operating System Market Share Worldwide.csv",
                         dtype={
                             "Date": str,
                             "Android": float,
                             "iOS": float,
                             "KaiOS": float,
                             "Samsung": float
                         })
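To check that the cast took effect, the dtypes attribute of the DataFrame can be inspected. The sketch below runs the same kind of read_csv() call on a few in-memory rows; the rows and their values are invented stand-ins, not the real dataset.

```python
import io

import pandas as pd

# Invented stand-in rows with the same column names as the guide's CSV.
csv_text = """Date,Android,iOS,KaiOS,Samsung
2019-01,70,20,1,0
2019-02,71,21,1,0
"""

frame = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"Date": str, "Android": float, "iOS": float,
           "KaiOS": float, "Samsung": float},
)

# Whole numbers in the text are now stored as floats rather than integers.
print(frame.dtypes)
```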
After typecasting, we are ready to use our dataset for visualizations.
# importing pygal library
import pygal
# lists that will hold one series per operating system
a = []
b = []
c = []
d = []
# creating object
line_chart = pygal.Line()
# naming the title
line_chart.title = 'Android, iOS, KaiOS and Samsung usage in 2019, month-wise'
# adding range of months from 1 to 12 (the end of range() is exclusive)
line_chart.x_labels = map(str, range(1, 13))
for index, row in data_frame.iterrows():
    a.append(row["Android"])
    b.append(row["iOS"])
    c.append(row["KaiOS"])
    d.append(row["Samsung"])
# adding the appended lists
line_chart.add('Android', a)
line_chart.add('iOS', b)
line_chart.add('KaiOS', c)
line_chart.add('Samsung', d)
# rendering the file
line_chart.render_to_file("line_chart.svg")
There is no need to manipulate our data for a stacked line chart.
# importing pygal library
import pygal
# lists that will hold one series per operating system
a = []
b = []
c = []
d = []
# creating object; fill=True shades the area under each line
line_chart = pygal.StackedLine(fill=True)
# naming the title
line_chart.title = 'Android, iOS, KaiOS and Samsung usage in 2019, month-wise'
# adding range of months from 1 to 12 (the end of range() is exclusive)
line_chart.x_labels = map(str, range(1, 13))
for index, row in data_frame.iterrows():
    a.append(row["Android"])
    b.append(row["iOS"])
    c.append(row["KaiOS"])
    d.append(row["Samsung"])
# adding the appended lists
line_chart.add('Android', a)
line_chart.add('iOS', b)
line_chart.add('KaiOS', c)
line_chart.add('Samsung', d)
# rendering the file
line_chart.render_to_file("Stacked line chart.svg")
We can also plot a basic bar graph using the same method.
# importing pygal library
import pygal
# lists that will hold one series per operating system
a = []
b = []
c = []
d = []
# creating object
bar_chart = pygal.Bar()
# naming the title
bar_chart.title = 'Android, iOS, KaiOS and Samsung usage in 2019, month-wise'
# adding range of months from 1 to 12 (the end of range() is exclusive)
bar_chart.x_labels = map(str, range(1, 13))
for index, row in data_frame.iterrows():
    a.append(row["Android"])
    b.append(row["iOS"])
    c.append(row["KaiOS"])
    d.append(row["Samsung"])
# adding the appended lists
bar_chart.add('Android', a)
bar_chart.add('iOS', b)
bar_chart.add('KaiOS', c)
bar_chart.add('Samsung', d)
# rendering the file
bar_chart.render_to_file("Basic Bar chart.svg")
For this chart, we have to add labels to identify our data for each month.
# importing pygal library
import pygal
# lists that will hold one series per operating system
a = []
b = []
c = []
d = []
# creating dot chart object
dot_chart = pygal.Dot(x_label_rotation=30)
# naming the title
dot_chart.title = 'Android, iOS, KaiOS and Samsung usage in 2019, month-wise'
# one label per month, January to December
dot_chart.x_labels = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
for index, row in data_frame.iterrows():
    a.append(row["Android"])
    b.append(row["iOS"])
    c.append(row["KaiOS"])
    d.append(row["Samsung"])
# adding the appended lists
dot_chart.add('Android', a)
dot_chart.add('iOS', b)
dot_chart.add('KaiOS', c)
dot_chart.add('Samsung', d)
# rendering the file
dot_chart.render_to_file("dots chart.svg")
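Typing the month names by hand makes it easy to miss one. As a sketch, the labels could instead be derived from the Date column, assuming its values look like 2019-01 (year and month). That format is an assumption about the CSV, so adjust the format string if the file differs.

```python
import pandas as pd

# Assumed Date values; the real column may use a different format.
dates = pd.Series(["2019-01", "2019-02", "2019-03"])

# Parse the strings as dates, then ask pandas for the English month names.
month_labels = pd.to_datetime(dates, format="%Y-%m").dt.month_name().tolist()
print(month_labels)
```

The resulting list can then be assigned to dot_chart.x_labels in place of the hand-written one.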
As we saw in the basic horizontal bar example above, this chart takes its data in a different shape from the line chart: one value per series instead of one list per series.
First, we will load the CSV file with Pandas again and transpose the dataset using the data_frame.T attribute of Pandas.
# importing pandas
import pandas as pd
# Reading the Data Frame
data_frame = pd.read_csv("Mobile Operating System Market Share Worldwide.csv")
# we will transpose the data set
data_frame = data_frame.T
data_frame.head()
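Now we will drop the Date row using the drop() function of Pandas. Because drop() returns a new DataFrame rather than changing the existing one, we assign the result back.
data_frame = data_frame.drop(['Date'])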
You can see no name is shown for the first column, and we will save and read the saved file again.
data_frame.to_csv(r'Os market share world wide.csv')
data_frame = pd.read_csv("Os market share world wide.csv")
data_frame
As you can see, after saving and rereading it, Pandas has automatically indexed rows as well.
Now we will name the unnamed column.
data_frame.rename(columns={'Unnamed: 0': 'new column'}, inplace=True)
So now, we are ready to plot our dataset with Pygal. The image below shows us our final dataset.
The column names 0 to 11 correspond to the months from January 2019 to December 2019.
Before visualizing the month of our choice, we need to typecast that column using the astype() function of Pandas.
data_frame = data_frame.astype({"new column": 'str', "0": 'float64'})
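The save-and-reread step is one way to turn the transposed index into a regular column. A possible alternative, sketched below on invented rows, does the same reshaping in memory. The column names "new column", "0", and "1" are chosen only to match the ones used in this guide.

```python
import pandas as pd

# Invented rows with the same layout as the guide's CSV.
frame = pd.DataFrame({
    "Date": ["2019-01", "2019-02"],
    "Android": [74.0, 75.0],
    "iOS": [22.0, 21.5],
})

# Make Date the index so that transposing leaves only numeric values.
wide = frame.set_index("Date").T
# Rename the month columns to "0", "1", ... as strings.
wide.columns = [str(position) for position in range(wide.shape[1])]
# Turn the operating system names (now the index) into a regular column.
wide = wide.rename_axis("new column").reset_index()
print(wide)
```

Because the Date values never enter the transposed table, the month columns stay numeric and need no later cast in this variant.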
Now we will plot our dataset using Pygal.
# importing the pygal library
import pygal
# creating horizontal bar chart object
bar_chart = pygal.HorizontalBar()
# naming the title
bar_chart.title = 'usage in month of january'
# adding one bar per operating system by iterating over the rows with iterrows()
for index, row in data_frame.iterrows():
    bar_chart.add(row["new column"], row["0"])
# rendering the file
bar_chart.render_to_file("horizontal bar chart.svg")
We will build the pie chart for July. To do that, we have to typecast the data for July, which is stored in column "6".
data_frame = data_frame.astype({"6": 'float64'})
Now we are ready to plot the pie chart.
# importing the pygal library
import pygal
# creating pie chart object
pie_chart = pygal.Pie()
# naming the title
pie_chart.title = 'usage in month of july'
# adding one slice per operating system by iterating over the rows with iterrows()
for index, row in data_frame.iterrows():
    pie_chart.add(row["new column"], row["6"])
# rendering the file
pie_chart.render_to_file("pie chart.svg")
In this guide, we used the Pygal and Pandas libraries to plot dynamic, interactive visualizations with several chart types. We saw how easy it is to integrate these two libraries to plot interactive charts from a CSV dataset. I found this combination easy and powerful. I hope that after reading this guide, you have learned some simple tricks for integrating these two powerful Python libraries.
If you have any queries related to this guide, feel free to ask at Codealphabet.