Candidate number: 31 Home Exam BAN438

In [1]:
import pandas as pd
import plotly.express as px
from pandas_datareader import wb      
import pandas_datareader.data as web  
from datetime import datetime, timedelta

from jupyter_dash import JupyterDash
from dash import dcc, html

import dash_bootstrap_components as dbc
from dash_bootstrap_templates import load_figure_template

from dash.dependencies import Input, Output
In [2]:
dbc_css = 'https://cdn.jsdelivr.net/gh/AnnMarieW/dash-bootstrap-templates@V1.0.2/dbc.min.css'
load_figure_template('bootstrap')

COVID-19 Dashboard

This notebook develops a dash application that allows users to explore how the COVID-19 situation has developed around the world

The notebook first compiles data on the development of COVID-19 cases, deaths and vaccinations by extracting data from Ourworldindata by using pandas-datareader. It then uses the data to create a dashboard with different visualizations that allow users to explore global COVID-19 cases, deaths and vaccinations.

Step 1: Read data

Read the data in Python from https://covid.ourworldindata.org/data/owid-covid-data.csv

In [3]:
#read the csv file
covid19_df = pd.read_csv("https://covid.ourworldindata.org/data/owid-covid-data.csv")
In [4]:
# import the necessary columns 
covid19_df = covid19_df[['iso_code', 'location', 'continent', 'date', 'new_cases', 'total_cases', 'total_cases_per_million', 
'new_deaths', 'total_deaths', 'total_deaths_per_million', 'new_vaccinations', 'people_fully_vaccinated',
'total_vaccinations_per_hundred', 'people_vaccinated_per_hundred']].copy()
In [60]:
covid19_df
Out[60]:
iso_code location continent date new_cases total_cases total_cases_per_million new_deaths total_deaths total_deaths_per_million new_vaccinations people_fully_vaccinated total_vaccinations_per_hundred people_vaccinated_per_hundred
0 AFG Afghanistan Asia 2020-02-24 5.0 5.0 0.126 NaN NaN NaN NaN NaN NaN NaN
1 AFG Afghanistan Asia 2020-02-25 0.0 5.0 0.126 NaN NaN NaN NaN NaN NaN NaN
2 AFG Afghanistan Asia 2020-02-26 0.0 5.0 0.126 NaN NaN NaN NaN NaN NaN NaN
3 AFG Afghanistan Asia 2020-02-27 0.0 5.0 0.126 NaN NaN NaN NaN NaN NaN NaN
4 AFG Afghanistan Asia 2020-02-28 0.0 5.0 0.126 NaN NaN NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
188783 ZWE Zimbabwe Africa 2022-05-19 199.0 250206.0 16578.529 1.0 5487.0 363.566 33825.0 4439317.0 76.88 41.05
188784 ZWE Zimbabwe Africa 2022-05-20 263.0 250469.0 16595.956 2.0 5489.0 363.699 17219.0 4442643.0 76.99 41.12
188785 ZWE Zimbabwe Africa 2022-05-21 0.0 250469.0 16595.956 0.0 5489.0 363.699 35059.0 4461860.0 77.22 41.19
188786 ZWE Zimbabwe Africa 2022-05-22 173.0 250642.0 16607.419 5.0 5494.0 364.030 23058.0 4474541.0 77.38 41.23
188787 ZWE Zimbabwe Africa 2022-05-23 60.0 250702.0 16611.394 1.0 5495.0 364.096 NaN NaN NaN NaN

188788 rows × 14 columns

In [5]:
#remove continents
covid19_df = covid19_df[~(covid19_df.location.isin(['Asia', 'Europe', 'European Union', 'North America', 'South America', 'Africa', 'Upper middle income', 'World' ]))].reset_index()

Step 2: Application

Step 2.1: Feature 1 Display the current number of total deaths, total cases and people that are fully vaccinated in the world.

This is a static (i.e. non-interactive) feature that simply states the accumulated number of deaths, cases and people that are fully vaccinated in the world up to the current date.

In [6]:
# compute total deaths
total_deaths = covid19_df['total_deaths'].sum().astype(int)
total_cases = covid19_df['total_cases'].sum().astype(int)
people_fully_vaccinated = covid19_df['people_fully_vaccinated'].sum().astype(int)
#format with comma
total_deaths = "{:,}".format(total_deaths)
total_cases = "{:,}".format(total_cases)
people_fully_vaccinated = "{:,}".format(people_fully_vaccinated)
In [7]:
card1 = dbc.Card( #create a first card component
    children = [  
        html.P('Static feature states the accumulated number of COVID 19', className = 'card-text'),

        # 1st row with number
        dbc.Row([
            dbc.Col(html.H2(total_deaths), width = 4),
            dbc.Col(html.H2(total_cases), width = 4),
            dbc.Col(html.H2(people_fully_vaccinated), width = 4)]),
        # 2nd row with label
        dbc.Row([
            dbc.Col(html.H6('Total deaths'), width = 4),
            dbc.Col(html.H6('Total cases'), width = 4),
            dbc.Col(html.H6('People fully vaccinated'), width = 4)])
    ], 
    body = True 
)
Step 2.2: Feature 2 Display a world map that shows the number of deaths, cases and vaccinations for each country in the data set.

This should be an interactive feature that depends on two selectors: The first selector is the variable, and the user should be able to select between the following variables: Deaths, Cases, Vaccinations

The second selector is the metric, which should depend on the chosen variable. If the user has chosen “Deaths” or “Cases,then the available metrics should be: Total, Total per 1 million population, Newley reported in the last 24 hours

However, if the user has selected “Vaccinations”, then the available metrics should be: Total doses administered per 100 population, Persons vaccinated with at least one dose per 100 population, Persons fully vaccinated with last dose of primary series

In [8]:
# create the selector metric
death_options = [
        {'label' : 'Total', 'value' : 'total_deaths'},
        {'label' : 'Total per 1 million population', 'value' : 'total_deaths_per_million'},
        {'label' : 'Newley reported in the last 24 hours', 'value' : 'new_deaths'}
]
case_options= [
        {'label' : 'Total', 'value' : 'total_cases'},
        {'label' : 'Total per 1 million population', 'value' : 'total_cases_per_million'},
        {'label' : 'Newley reported in the last 24 hours', 'value' : 'new_cases'}]

vaccine_options = [
        {'label' : 'Total doses administered per 100 population', 'value' : 'total_vaccinations_per_hundred'},
        {'label' : 'Persons vaccinated with at least one dose per 100 population', 'value' : 'people_vaccinated_per_hundred'},
        {'label' : 'Persons fully vaccinated with last dose of primary series', 'value' : 'people_fully_vaccinated'}
]

choices = {
        "total_deaths": death_options,
        "total_cases": case_options,
        "people_fully_vaccinated": vaccine_options,
}

variable_dropdown1 = dcc.Dropdown(
    id = 'my_variable1',
    options = [
        {'label' : 'Deaths', 'value' : 'total_deaths'},
        {'label' : 'Cases', 'value' : 'total_cases'},
        {'label' : 'Vaccinations', 'value' : 'people_fully_vaccinated'}],
    value = 'total_deaths'
)

variable_dropdown2 = dcc.Dropdown(
    id = 'my_variable2',
    options = choices['total_deaths'],
    value = 'total_deaths'
)

variable_dropdown3 = dcc.Dropdown(
    id = 'my_variable3',
    options = [
        {'label' : 'Deaths', 'value' : 'total_deaths'},
        {'label' : 'Cases', 'value' : 'total_cases'},
        {'label' : 'Vaccinations', 'value' : 'people_fully_vaccinated'}],
    value = 'total_deaths'
)
In [9]:
# sum each variable in each country
covid19_df2 = covid19_df.groupby('iso_code').sum().reset_index()
In [66]:
covid19_df2
Out[66]:
iso_code index new_cases total_cases total_cases_per_million new_deaths total_deaths total_deaths_per_million new_vaccinations people_fully_vaccinated total_vaccinations_per_hundred people_vaccinated_per_hundred
0 ABW 6898403 36316.0 9.674345e+06 9.024997e+07 273.0 77077.0 719035.359 111475.0 2.083860e+07 41658.24 22218.37
1 AFG 335790 179724.0 7.265500e+07 1.823879e+06 7698.0 3158466.0 79287.856 14125.0 1.194489e+08 431.62 366.00
2 AGO 3576705 99287.0 3.009903e+07 8.869975e+05 1903.0 675594.0 19909.311 0.0 1.366561e+08 1198.37 806.71
3 AIA 4163230 3203.0 4.952280e+05 3.274235e+07 10.0 1453.0 96066.107 1421.0 3.282950e+05 5645.89 2961.21
4 ALB 1687140 275881.0 9.256860e+07 3.222093e+07 3497.0 1450026.0 504719.544 1417691.0 1.348863e+08 11161.88 5624.20
... ... ... ... ... ... ... ... ... ... ... ... ...
231 WSM 80620980 12382.0 4.294960e+05 2.145935e+06 25.0 796.0 3977.137 0.0 3.051485e+06 4029.10 2304.42
232 YEM 144589779 11820.0 4.425025e+06 1.451273e+05 2149.0 890350.0 29200.778 0.0 5.025306e+06 42.66 34.49
233 ZAF 131102658 3912948.0 1.446291e+09 2.408799e+07 100629.0 41394345.0 689423.188 17834208.0 2.934570e+09 9613.45 6123.64
234 ZMB 149512418 321195.0 9.938193e+07 5.252562e+06 3985.0 1465996.0 77481.242 1635808.0 1.841189e+08 1587.54 180.75
235 ZWE 149770050 250708.0 6.554053e+07 4.342684e+06 5495.0 1836380.0 121677.659 10431017.0 8.551606e+08 14010.42 8005.38

236 rows × 12 columns

In [11]:
card2 = dbc.Card( #create a second card component
    children = [
        html.H4('Word Map of COVID-19', className = 'card-title'),
        html.P('A world map that shows the number of deaths, cases and vaccinations for each country in the data set.', className = 'card-text'),
        dbc.Row(
            children = [variable_dropdown1, variable_dropdown2]
        ),
        
        dcc.Graph(id = 'my_map')
    ], 
    body = True
)
Step 2.3: Feature 3 Display a bar plot that shows the global number of new deaths, cases and vaccinations for each week in the data set.

This feature should also be an interactive feature, but it should only depend on the chosen variable. When the user selects one of the available variables (“Deaths”, “Cases” and “Vaccinations”), the bar plot should display the global number of new occurrences for that variable in each week of the data set.

In [12]:
#sum each variable in each week
covid19_df['date'] = pd.to_datetime(covid19_df['date'])
covid19_df3 = covid19_df.resample('W', on = 'date').sum().reset_index()
covid19_df3
Out[12]:
date index new_cases total_cases total_cases_per_million new_deaths total_deaths total_deaths_per_million new_vaccinations people_fully_vaccinated total_vaccinations_per_hundred people_vaccinated_per_hundred
0 2020-01-05 1246010 0.0 0.000000e+00 0.000000e+00 0.0 0.0 0.000 0.0 0.000000e+00 0.00 0.00
1 2020-01-12 2908423 0.0 0.000000e+00 0.000000e+00 0.0 0.0 0.000 0.0 0.000000e+00 0.00 0.00
2 2020-01-19 3815286 0.0 0.000000e+00 0.000000e+00 0.0 0.0 0.000 0.0 0.000000e+00 0.00 0.00
3 2020-01-26 13059186 1607.0 5.836000e+03 2.768400e+01 39.0 159.0 0.111 0.0 0.000000e+00 0.00 0.00
4 2020-02-02 28076774 14780.0 6.242000e+04 1.545060e+02 311.0 1362.0 1.036 0.0 0.000000e+00 0.00 0.00
... ... ... ... ... ... ... ... ... ... ... ... ...
160 2023-01-29 208635546 2882895.0 8.464243e+09 2.908763e+08 29558.0 77532856.0 1886649.940 16696911.0 3.749140e+10 59466.51 22299.82
161 2023-02-05 207816539 2592794.0 8.483389e+09 2.912665e+08 24966.0 77724002.0 1888889.574 16459975.0 3.720146e+10 58679.39 21598.84
162 2023-02-12 207406854 2227846.0 8.499874e+09 2.916543e+08 16307.0 77868701.0 1891498.278 9688986.0 3.647611e+10 55254.19 20354.06
163 2023-02-19 206365852 2113016.0 8.514915e+09 2.921741e+08 15645.0 77976940.0 1893373.585 9187455.0 3.632335e+10 55093.87 20745.62
164 2023-02-26 174992291 1707377.0 7.309600e+09 2.507471e+08 11175.0 66914311.0 1624168.899 1233017.0 2.813363e+10 29671.00 11170.78

165 rows × 12 columns

In [13]:
card3 = dbc.Card( # create a third card component
        children = [
        html.H4('Bar Plot', className = 'card-title'),
        html.P('A bar plot that shows the global number of new deaths, cases and vaccinations for each week in the data set.', className = 'card-text'),
        dbc.Row(
            dbc.Col([html.Label('Select variable:'),variable_dropdown3], width = 6)
            ),
        dcc.Graph(id = 'my_barplot')
    ], 
    body = True
)
Step 2.4: Deploy the app
In [14]:
app = JupyterDash(external_stylesheets = [dbc.themes.BOOTSTRAP, dbc_css])

app.layout = dbc.Container(
    children = [
        
        # header
        html.H1('COVID-19 Dashboard'),
        dcc.Markdown(
            """Data on daily observations on COVID-19 are extracted from the 
               [Our World in Data](https://covid.ourworldindata.org/data/owid-covid-data.csv) 
               database."""
        ),
        # insert cards
        card1,
        html.Br(),
        card2,
        html.Br(),
        card3,
        html.Br(),
        ],
    className = 'dbc'
)
@app.callback(
        Output('my_variable2', component_property="options"),
        Output('my_variable2', component_property="value"),
        Input('my_variable1', component_property="value")

        )
def update_dropdown(choice):
    return choices[choice], choices[choice][0]['value']


@app.callback(
    Output('my_map', 'figure'),
    Input('my_variable2', 'value'),

)
def update_map(metric):
    dff = covid19_df2
    map=px.choropleth(
            dff,
            locations = dff['iso_code'],
            color = metric,                  
        color_continuous_scale = 'blues',    
        hover_name = 'iso_code',              
        hover_data = {'iso_code' : False},
    )
    map.update_layout(
        coloraxis_colorbar_title = None,
        geo_showframe = False,
        margin = {'l' : 0, 'r' : 0, 'b' : 0, 't' : 0}
    )
    return(map)

@app.callback(
    Output('my_barplot', 'figure'),
    Input('my_variable3', 'value')
)
def update_bar_plot(variable):
    dff = covid19_df3
    barchart=px.bar(
            data_frame=dff,
            x=dff['date'],
            y=dff[variable])   
    return(barchart)

app.run_server(port = 3420)
Dash app running on http://127.0.0.1:3420/
In [ ]: