Kimaru Thagana

Data Science on the Web with Streamlit

  • Sep 21, 2020
  • 4 Min read
Data Analytics
Machine Learning


Data science uses scientific methods to extract information and insights from data and present them to relevant stakeholders. Custom data science tools are often developed as programs or scripts in languages such as Python. Data scientists may find it hard to share those scripts and programs when communicating their insights, especially if the audience is non-technical. They may also need a web-based resource to reach a wider audience.

The need to perform data science on the web through a simple, minimal interface led to the open-source Python library Streamlit. Streamlit is often described as a fast way to build data apps for the web, and it is aimed at data scientists, data analysts, machine learning engineers, and business intelligence developers.

This guide will explore the library via a sample app.

Consider a scenario where you are the data scientist at a startup. You have been selected to design and develop an interactive tool for your company's client, a winery that has collected data on the chemical components of its wines. The winery wants an interface where it can interact with the dataset, build simple visualizations on demand, and filter columns from the data.

The client is not available for a physical meeting but is open to a remote one, and wants an interactive interface for making simple selections. For this task, you choose Streamlit to host the interface and share it with the client.

This guide assumes you have at least intermediate knowledge of Python and some experience in data science.


To get started with Streamlit, install it with pip and then launch its built-in demo app by running the commands below in your terminal.

pip install streamlit
streamlit hello

If Streamlit opens and runs in your browser, the installation was successful. Create a new Python file and name it

Sample App

Import the required libraries and set up the page title and welcome text.

import streamlit as st
import pandas as pd

# Page title and welcome text
st.title("Winery Inc. Welcome")
st.header("Data Visualization Board for Winery Inc Chemical Components")
# Load the sample dataset into a dataframe
wine = pd.read_csv("wine_data.csv")

Sample Data

The data is in CSV format and is an extract from the wine dataset; its header row is shown below. Generate more values to build a sample dataset and save the file as wine_data.csv.

date,pH,sulphur dioxide,acidity,color
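
If you do not have the winery's readings at hand, one way to build a sample dataset is to generate it. The sketch below is only an illustration: the function name `write_sample_wine_data`, the start date, the value ranges, and the choice to treat `color` as a numeric intensity are all assumptions made for this example, and the values are random rather than real measurements.

```python
import csv
import random
from datetime import date, timedelta

# Column names taken from the header row shown in this guide.
COLUMNS = ["date", "pH", "sulphur dioxide", "acidity", "color"]


def write_sample_wine_data(path="wine_data.csv", days=30, seed=42):
    """Write `days` rows of synthetic daily readings to `path`.

    All ranges below are arbitrary placeholders, not real wine chemistry.
    """
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    start = date(2020, 1, 1)   # arbitrary start date (assumption)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(COLUMNS)
        for i in range(days):
            writer.writerow([
                (start + timedelta(days=i)).isoformat(),  # date
                round(rng.uniform(2.9, 3.9), 2),          # pH
                round(rng.uniform(20.0, 60.0), 1),        # sulphur dioxide
                round(rng.uniform(4.0, 9.0), 2),          # acidity
                round(rng.uniform(1.0, 10.0), 2),         # color (assumed numeric)
            ])
    return path


write_sample_wine_data()
```

Running the script once in the project folder leaves a wine_data.csv file with one header row and 30 rows of readings, which is enough for the app to load and chart.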

Display the dataframe and let the user select which columns they would like to see.

selected_columns = st.multiselect('Select desired Columns', wine.columns.to_list(), default=['acidity','pH'])

Display a line graph with the selected columns as input. Allowing selections lets the user compare more than one column within the visualization. The date column shows the day the readings were recorded. This column is set as the index and serves as the x-axis of the chart.


With the above setup, the interface allows the user to select columns to view in the dataset and in the graph.

Running the Project

To run your project, execute the following command in your terminal, followed by the name of your Python file:

streamlit run

The project opens in your default browser at localhost:8501, Streamlit's default local address and port.


Bringing data science and analytics to the web via Streamlit allows you to easily share your data science projects and research with colleagues and the general public. This skill is best applied in job positions such as data analyst, data scientist and business intelligence developer.

To further build on this guide, sign up for a free account on Heroku and follow this guide to upload your data science, machine learning, or data analytics project.