2019 CyberGIS AAG-UCGIS Summer School - Group 5

A Reproducible and Replicable Spatially Explicit Agent-Based Model Using CyberGIS-Jupyter: A Case Study in Queen Anne Neighborhood, Seattle, WA


Members: Alexander Michels<sup>a</sup>, Angie Werner<sup>b</sup>, Brandon Lieberthal<sup>c</sup>, Changzhen Wang<sup>d</sup>, Mishka Vance<sup>e</sup>, Li Yi<sup>f</sup>
Mentor: Jeon-Young Kang<sup>g</sup>

<small>
<sup>a</sup> Mathematics and computational analysis, University of Illinois Urbana-Champaign
<sup>b</sup> Environmental health and epidemiology, National Environmental Public Health Tracking Program, Centers for Disease Control and Prevention
<sup>c</sup> Spatial analysis and geovisualization, University of Maine
<sup>d</sup> Geography, Louisiana State University
<sup>e</sup> Human geography, Hunter College
<sup>f</sup> Spatial analysis and modeling, geovisualization, University of Southern California
<sup>g</sup> Spatially explicit agent-based modeling, University of Illinois Urbana-Champaign
</small>

Introduction

Influenza is a contagious respiratory illness caused by influenza viruses. The flu is spread from person to person, often by droplets when those who are infected cough, sneeze, or talk. Sometimes people can be infected by touching a surface that has the virus on it, although this is not as common. Symptoms include fever, cough, sore throat, muscle aches, headaches, and fatigue. Influenza has a high burden of disease despite vaccination and public health policies that have been implemented. The CDC estimates that more than 900,000 people were hospitalized and over 80,000 people died from the flu in the 2017-2018 season (CDC, 2018).

Spatial simulation modeling (also known as agent-based modeling, or ABM) serves as a tool to simulate and capture various spatiotemporal phenomena. ABM can be used to explore the epidemiology and spread of diseases such as influenza. The phenomena being modeled emerge from the dynamic interactions and behaviors of individual agents and their environments. The modeling process, however, is not always simple: it requires conceptualizing and understanding the spatiotemporal process at the individual level. An ABM has several key components, including its agents, their properties, the environment, and rules that specify how the agents interact and what happens when they do. This work applies a spatially explicit ABM to simulate influenza spread.
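To make these components concrete, the following is a minimal, hypothetical sketch of an agent with properties, an environment, and an interaction rule. It is illustrative only and far simpler than the Human.py and Env.py classes used later in this notebook; all names here are ours.

import random

class Agent:
    def __init__(self, age):
        self.age = age            # property
        self.infected = False     # property (disease state)

    def contact(self, other, inf_rate):
        # rule: contact with an infected agent may transmit the disease
        if other.infected and random.random() < inf_rate:
            self.infected = True

class Environment:
    def __init__(self, agents):
        self.agents = agents      # the environment holds the agents

    def step(self, inf_rate=0.18):
        # rule: each day, every pair of agents comes into contact once
        for a in self.agents:
            for b in self.agents:
                if a is not b:
                    a.contact(b, inf_rate)

# one simulated day on a toy population
env = Environment([Agent(age) for age in (8, 35, 70)])
env.agents[0].infected = True
env.step()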

Research Question

To what degree can a spatially explicit agent-based model simulate influenza spread in the Queen Anne neighborhood in Seattle, Washington?

Main Objectives

  1. To conceptualize a spatiotemporal phenomenon: influenza transmission
  2. To capture and simulate the influenza transmission process
  3. To explore the spatiotemporal patterns of influenza outbreaks

Workflow

1. Find the Data

1.1 Geographic data

The geographic data for households, workplaces, and schools were obtained from the City of Seattle Open Data Hub. The data were then extracted based on the zoning types that overlay the building footprints. Both the building footprint and zoning datasets are available for download from the Open Data Hub.

1.2 Neighborhood selection

For this workshop, we selected the Queen Anne neighborhood in Seattle as a use case, and the datasets were subset by choosing x and y limits (in the UTM Zone 10N coordinate system) that bound the neighborhood:

xmin = 546302; xmax = 549610; ymin = 5274394; ymax = 5277978
nhouses_old = len(houses)
houses = houses[(houses.X >= xmin) & (houses.X <= xmax) & (houses.Y >= ymin) & (houses.Y <= ymax)]
nhouses_new = len(houses)
print(nhouses_new, " of ", nhouses_old, " houses were selected.")

1.3 Human agents

Interactions between the human agents and their environments (their attributes and methods) are defined in Human.py and Env.py. The datasets needed for this step are the population distribution and household size. Both datasets are at the block group level (specific to the Queen Anne neighborhood) and were obtained from the American Community Survey (ACS) 2013-2017 5-year estimates.

Population distribution data - Table B01001 - Sex by Age

Household size data - Table B11016 - Household Type by Household Size

These data were joined to the Queen Anne neighborhood block groups in order to calculate weights for the population distribution and household size across the entire neighborhood. Ages were grouped using the standard 5-year age groups, giving 18 categories (starting with 0-4 and ending with 85+). Household size used only family households (i.e., non-family households were excluded) in five categories (size 2, size 3, size 4, size 5, and size 6 or more).
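As an illustration of this weighting step, the sketch below sums block-group counts and normalizes them into weights; the DataFrame and its column names are placeholders for the joined ACS tables, not the actual field names.

import pandas as pd

# Toy stand-in for the joined ACS block-group table (column names are illustrative).
acs = pd.DataFrame({
    'blockgroup':  ['A', 'B'],
    'age_0_4':     [120,  80],
    'age_5_9':     [100,  90],
    'age_85_plus': [ 30,  20],
})

# Sum counts across the Queen Anne block groups and normalize to get
# neighborhood-wide weights, which can be used as probabilities when drawing
# agent ages; household-size weights are computed the same way from Table B11016.
age_counts = acs[['age_0_4', 'age_5_9', 'age_85_plus']].sum()
age_weights = age_counts / age_counts.sum()
print(age_weights)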

1.4 Influenza data

Influenza data were obtained from the 2018-2019 Weekly Flu Surveillance Reports of the Seattle & King County Department of Public Health. The starting week was October 7-13, 2018 and the ending week was April 28-May 4, 2019. All weekly data in this period were summed across influenza types (influenza A, influenza B, etc.) to obtain total influenza cases over time. These data provide the reference series against which the simulation results are compared.
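The aggregation itself is a simple column sum; the sketch below is illustrative, and the type-specific column names ('Influenza_A', 'Influenza_B') are placeholders rather than the actual surveillance-report fields.

import pandas as pd

# Toy weekly counts; the real file spans October 7, 2018 through May 4, 2019.
flu = pd.DataFrame({
    'Date':        ['2018-10-07', '2018-10-14'],
    'Influenza_A': [12, 15],
    'Influenza_B': [ 3,  4],
})

# Total positives per week across influenza types, matching the 'Positive'
# column used later when plotting the reference data.
flu['Positive'] = flu['Influenza_A'] + flu['Influenza_B']
print(flu)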


1.5 Additional data

Race and median household income data were also obtained from the American Community Survey (2013-2017 5-year estimates) for the Queen Anne neighborhood block groups. This information was used for visual analysis to examine the distribution of influenza cases in relation to different sociodemographic characteristics.

Race data - Table B02001 - Race

Median household income data - Table B19013 - Median household income in the past 12 months

2. Synthesize a Population

2.1 The locations of households, schools, and workplaces are obtained by overlaying building footprint data with zoning data (e.g., residential, commercial, educational).
2.2 Age distributions and household compositions are synthesized to match the American Community Survey (ACS) 2013-2017 5-year estimates (see the Find the Data section for more information).
2.3 Individual humans have an age and age-dependent daily behaviors: commuting to school (ages 6 to 19), commuting to a workplace (ages 20 to 64), or staying at home all day (everyone else); a sketch of this assignment follows the list.
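A hedged sketch of the age-dependent assignment in 2.3 is shown below. The nearest-place rule reflects the model assumption noted in the Future Work section; the helper names, the daytime_place attribute, and the x/y attributes on the place objects are illustrative, not the actual Human.py/Env.py names.

import math

def nearest(places, x, y):
    # pick the place whose coordinates are closest to the agent's house
    return min(places, key=lambda p: math.hypot(p.x - x, p.y - y))

def assign_daytime_place(person, schoolList, workList):
    # ages 6-19 commute to the nearest school, ages 20-64 to the nearest
    # workplace, and everyone else stays at home all day
    if 6 <= person.age <= 19:
        person.daytime_place = nearest(schoolList, person.houseX, person.houseY)
    elif 20 <= person.age <= 64:
        person.daytime_place = nearest(workList, person.houseX, person.houseY)
    else:
        person.daytime_place = None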

3. Simulate a Flu Season

3.1 A random sample of people is infected at the beginning of the simulation (a sketch of this seeding step follows the list).
3.2 The disease process follows a susceptible-exposed-infectious-recovered (SEIR) model.
3.3 An individual human can spread influenza to his or her classmates, co-workers, and household members.
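The seeding in 3.1 is handled by Env.initialInfection later in the notebook; a minimal sketch of what such a step might look like, assuming the S/E state flags used elsewhere in this notebook and treating introRate as a fraction of the population, is:

import random

def initial_infection_sketch(peopleList, introRate, rng=random):
    # hypothetical sketch (not Env.py): mark a random fraction of agents as exposed
    n_seed = max(1, int(round(introRate * len(peopleList))))
    for person in rng.sample(peopleList, n_seed):
        person.S = False
        person.E = True
    return peopleList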

4. Analyze and Visualize

4.1 The model output is saved as a 'results.csv' file.
4.2 The file consists of three columns: the 'time', 'x-coordinate', and 'y-coordinate' of the influenza outbreaks.

Code

Import necessary libraries to run this model.

In [4]:
!pip install --user folium
Requirement already satisfied: folium in /home/jovyan/.local/lib/python3.7/site-packages (0.10.1)
Requirement already satisfied: jinja2>=2.9 in /opt/conda/lib/python3.7/site-packages (from folium) (2.11.1)
Requirement already satisfied: numpy in /opt/conda/lib/python3.7/site-packages (from folium) (1.18.1)
Requirement already satisfied: requests in /opt/conda/lib/python3.7/site-packages (from folium) (2.23.0)
Requirement already satisfied: branca>=0.3.0 in /opt/conda/lib/python3.7/site-packages (from folium) (0.3.1)
Requirement already satisfied: MarkupSafe>=0.23 in /opt/conda/lib/python3.7/site-packages (from jinja2>=2.9->folium) (1.1.1)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.7/site-packages (from requests->folium) (2019.11.28)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/conda/lib/python3.7/site-packages (from requests->folium) (1.25.7)
Requirement already satisfied: idna<3,>=2.5 in /opt/conda/lib/python3.7/site-packages (from requests->folium) (2.9)
Requirement already satisfied: chardet<4,>=3.0.2 in /opt/conda/lib/python3.7/site-packages (from requests->folium) (3.0.4)
Requirement already satisfied: six in /opt/conda/lib/python3.7/site-packages (from branca>=0.3.0->folium) (1.14.0)
In [33]:
import numpy as np
import folium, itertools, math, pickle, random, time
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.pylab as pylab
import matplotlib.image as mpimg
import matplotlib.animation as animation
from os import path

from pyproj import Proj, transform

import os
os.environ["PROJ_LIB"] = '/opt/conda/envs/python2/share/proj'
In [34]:
import Env
import Human

Import the Python modules containing the classes and functions for the environment and human behaviors.

Load (or Find) Your Data

Files of households, workplaces, and schools for the Queen Anne neighborhood are imported.

Choose x and y limits (in UTM) and filter houses to select those in the Queen Anne neighborhood.

In [35]:
# Files of households, workplaces, and schools are imported.
CITY = "QueenAnne" # name of the city you're loading data for
houses = pd.read_csv('./data/{}/houses_points.csv'.format(CITY))
schools = pd.read_csv('./data/{}/schools_points.csv'.format(CITY))
works = pd.read_csv('./data/{}/works_points.csv'.format(CITY))

# Choose x and y limits (in UTM) and filter houses
xmin = 546302;
xmax = 549610;
ymin = 5274394;
ymax = 5277978;
nhouses_old = len(houses);
houses = houses[(houses.X >= xmin) & (houses.X <= xmax) & (houses.Y >= ymin) & (houses.Y <= ymax)];
nhouses_new = len(houses);
print(nhouses_new," of ",nhouses_old," houses filtered.")
16755  of  391992  houses filtered.

Inspect the first five rows of the house, school, and workplace datasets, then create figures to plot each dataset.

In [36]:
house_cat = houses.DESCR.unique().tolist() # Obtain the type of households
house_dict = {}
rd_color =["#"+''.join([random.choice('0123456789ABCDEF') for j in range(6)]) 
                 for i in range(len(house_cat))]  # generate a random color for each household type
for i in range(len(house_cat)):    
    houses.loc[houses['DESCR'] == house_cat[i], 'color'] = rd_color[i]   # set value for color column
    house_dict[house_cat[i]] = rd_color[i]
housemyProj = Proj("+proj=utm +zone=10N, +north +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
houselonlat = housemyProj(houses['X'].values.tolist(), houses['Y'].values.tolist(), inverse=True)
houses['lat'] = houselonlat[1]
houses['lon'] = houselonlat[0]
houses.head()
Out[36]:
    houseID         DESCR            X            Y  Shape_Length  Shape_Area    color        lat         lon
0        14  Multi-Family  546383.0609  5277176.021     39.147069   90.045604  #44B222  47.646319 -122.382414
1        15  Multi-Family  546332.4273  5277159.527     38.950530   84.198371  #44B222  47.646174 -122.383090
2        16  Multi-Family  546357.1243  5277185.342     46.210541  117.874763  #44B222  47.646405 -122.382758
7        21  Multi-Family  546384.6356  5277268.185     72.561298  259.678237  #44B222  47.647148 -122.382383
15       77  Multi-Family  546349.0929  5277102.132     53.187976  151.622510  #44B222  47.645657 -122.382874
In [37]:
school_cat = schools.CODE.unique().tolist() # Obtain the type of schools
school_dict = {}
rd_color =["#"+''.join([random.choice('0123456789ABCDEF') for j in range(6)]) 
                 for i in range(len(school_cat))]   # generate random color based on the types of schools
for i in range(len(school_cat)):    
    schools.loc[schools['CODE'] == school_cat[i], 'color'] = rd_color[i]   # set value for color column
    school_dict[school_cat[i]] = rd_color[i]
    
schoolmyProj = Proj("+proj=utm +zone=10N, +north +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
schoollonlat = schoolmyProj(schools['X'].values.tolist(), schools['Y'].values.tolist(), inverse=True)
schools['lat'] = schoollonlat[1]
schools['lon'] = schoollonlat[0]
schools.head()
Out[37]:
   schoolID                   CODE                            NAME         ABB_NAME            X            Y    color        lat         lon
0         1  College or University     Cornish College of the Arts  Cornish College  549874.0518  5274074.670  #B65EF2  47.618156 -122.336288
1         2             Elementary                 St. Paul School         St. Paul  555512.6343  5262151.157  #916732  47.510420 -122.262764
2         3             Elementary  Amazing Grace Christian School    Amazing Grace  555738.5532  5262203.525  #916732  47.510872 -122.259758
3         4             Elementary                 St. Anne School         St. Anne  548220.3227  5275499.850  #916732  47.631104 -122.358137
4         5             Elementary        Lowell Elementary School           Lowell  551216.9704  5274783.294  #916732  47.624427 -122.318335
In [38]:
work_cat = works.DESCR.unique().tolist() # obtain the type of workplaces
work_dict = {}
rd_color =["#"+''.join([random.choice('0123456789ABCDEF') for j in range(6)])
                 for i in range(len(work_cat))]  # generate random color based on the types of workplaces
for i in range(len(work_cat)):    
    works.loc[works['DESCR'] == work_cat[i], 'color'] = rd_color[i]   # set value for color column
    work_dict[work_cat[i]] = rd_color[i]
    
workmyProj = Proj("+proj=utm +zone=10N, +north +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
worklonlat = workmyProj(works['X'].values.tolist(), works['Y'].values.tolist(), inverse=True)
works['lat'] = worklonlat[1]
works['lon'] = worklonlat[0]
works.head()
Out[38]:
   workID                     DESCR            X            Y  Shape_Length  Shape_Area    color        lat         lon
0       1      Commercial/Mixed Use  545239.5425  5279310.598     20.764958   28.055354  #9E56A2  47.665605 -122.397418
1       2  Manufacturing/Industrial  546827.6875  5278198.516     38.236831   80.410245  #5249AE  47.655487 -122.376385
2       3      Commercial/Mixed Use  546775.8362  5277486.733     81.645021  338.953912  #9E56A2  47.649086 -122.377151
3       4      Commercial/Mixed Use  546778.9302  5277470.385    110.783883  499.581222  #9E56A2  47.648939 -122.377112
4       5      Commercial/Mixed Use  546779.4392  5277502.181     91.714040  403.465279  #9E56A2  47.649225 -122.377102
In [39]:
# count the number by house types
for key,value in house_dict.items():
    sub_houses = houses[houses.DESCR == key]
    print("The number of", key, "households is ",len(sub_houses)) # counting the number of houeholds

print("The number of households in total:", len(houses)) # counting the number of households
The number of Multi-Family households is  5282
The number of Single Family households is  11473
The number of households in total: 16755

Code to produce the interactive map for houses and save the map as an .html file:

m = folium.Map(location=[47.638972, -122.361579], tiles='cartodbpositron', zoom_start=14)
for i in range(0, len(houses)):
    folium.CircleMarker(
        location=[houses.iloc[i]['lat'], houses.iloc[i]['lon']],
        popup=houses.iloc[i]['DESCR'],
        radius=2, color='grey', weight=0.1,
        fill=True, fill_color=houses.iloc[i]['color'], fill_opacity=0.6
    ).add_to(m)

# build a simple HTML legend from the household types and their colors (markup approximated)
legend_html = '<div style="position: fixed; bottom: 50px; left: 50px; z-index: 9999; background: white; padding: 10px;"><b>Legend</b><br>'
for key, value in house_dict.items():
    legend_html += '<i style="background: {}">&nbsp;&nbsp;&nbsp;</i> {}<br>'.format(value, key)
legend_html += '</div>'
m.get_root().html.add_child(folium.Element(legend_html))
m.save('maps/houses_interactive_map.html')

In [55]:
# display the interactive map for houses

from IPython.display import IFrame

IFrame(src='maps/houses_interactive_map.html', width=900, height=700)
In [56]:
# count the number by school types
for key,value in school_dict.items():
    sub_schools = schools[schools.CODE == key]
    print("The number of", key, "schools is ",len(sub_schools)) # counting the number of schools

print("The number of schools in total:", len(schools)) # counting the number of schools
The number of College or University schools is  13
The number of Elementary schools is  103
The number of High schools is  22
The number of Alternative schools is  24
The number of Junior High/Middle schools is  18
The number of Other Facility schools is  2
The number of K-12 schools is  1
The number of schools in total: 183

Code to produce the interactive map for schools and save the map as an .html file:

m = folium.Map(location=[47.638972, -122.361579], tiles='cartodbpositron', zoom_start=14)
for i in range(0, len(schools)):
    folium.CircleMarker(
        location=[schools.iloc[i]['lat'], schools.iloc[i]['lon']],
        popup=schools.iloc[i]['NAME'],
        radius=5, color='grey', weight=0.1,
        fill=True, fill_color=schools.iloc[i]['color'], fill_opacity=0.6
    ).add_to(m)

# build a simple HTML legend from the school types and their colors (markup approximated)
legend_html = '<div style="position: fixed; bottom: 50px; left: 50px; z-index: 9999; background: white; padding: 10px;"><b>Legend</b><br>'
for key, value in school_dict.items():
    legend_html += '<i style="background: {}">&nbsp;&nbsp;&nbsp;</i> {}<br>'.format(value, key)
legend_html += '</div>'
m.get_root().html.add_child(folium.Element(legend_html))
m.save('maps/schools_interactive_map.html')

In [57]:
# display the interactive map
from IPython.display import IFrame

IFrame(src='maps/schools_interactive_map.html', width=900, height=700)
In [58]:
# count the number by workplace types
for key, value in work_dict.items():
    sub_works = works[works.DESCR == key]
    print("The number of", key, "workplace is ",len(sub_works)) # counting the number of schools
print("The number of workplace in total:", len(works)) # counting the number of schools
The number of Commercial/Mixed Use workplace is  15040
The number of Manufacturing/Industrial workplace is  6066
The number of Downtown workplace is  1672
The number of workplace in total: 22778

Code to produce the interactive map for workplaces and save the map as an .html file:

m = folium.Map(location=[47.638972, -122.361579], tiles='cartodbpositron', zoom_start=14)
for i in range(0, len(works)):
    folium.CircleMarker(
        location=[works.iloc[i]['lat'], works.iloc[i]['lon']],
        popup=works.iloc[i]['DESCR'],
        radius=2, color='grey', weight=0.1,
        fill=True, fill_color=works.iloc[i]['color'], fill_opacity=0.6
    ).add_to(m)

# build a simple HTML legend from the workplace types and their colors (markup approximated)
legend_html = '<div style="position: fixed; bottom: 50px; left: 50px; z-index: 9999; background: white; padding: 10px;"><b>Legend</b><br>'
for key, value in work_dict.items():
    legend_html += '<i style="background: {}">&nbsp;&nbsp;&nbsp;</i> {}<br>'.format(value, key)
legend_html += '</div>'
m.get_root().html.add_child(folium.Element(legend_html))
m.save('maps/works_interactive_map.html')

In [59]:
# display the interactive map
from IPython.display import IFrame
IFrame(src='maps/works_interactive_map.html', width=900, height=700)

Before running the simulation, the environments (schools, workplaces, households) and the individual human agents need to be set up.
The functions settingSchools(), settingWorks(), and settingHouseholds() are included in Env.py.
The function settingHumanAgent() is included in Human.py.

Synthesize a Population

In [17]:
try: # try to open the preprocessed data
    with open("data/{}/schoolList.pickle".format(CITY), "rb") as f:
        schoolList = pickle.load(f)
    print("...loaded school data from pickle...")
except: # if data does not exist, write it
    schoolList = Env.settingSchools('./data/{}'.format(CITY))
    with open("data/{}/schoolList.pickle".format(CITY), "wb") as f:
        pickle.dump(schoolList, f)
try:
    with open("data/{}/workList.pickle".format(CITY), "rb") as f:
        workList = pickle.load(f)
    print("...loaded work list from pickle...")
except:
    workList = Env.settingWorks('./data/{}'.format(CITY))
    with open("data/{}/workList.pickle".format(CITY), "wb") as f:
        pickle.dump(workList, f)
try:
    with open("data/{}/houseList.pickle".format(CITY), "rb") as f:
        houseList = pickle.load(f)
    print("...loaded house list from pickle...")
except:
    houseList = Env.settingHouseholds('./data/{}'.format(CITY), schoolList, workList, houses = houses)
    with open("data/{}/houseList.pickle".format(CITY), "wb") as f:
        pickle.dump(houseList, f)
try:
    with open("data/{}/peopleList.pickle".format(CITY), "rb") as f:
        peopleList = pickle.load(f)
    print("...loaded people list from pickle...")
except:
    peopleList = Human.settingHumanAgent(houseList)
    with open("data/{}/peopleList.pickle".format(CITY), "wb") as f:
        pickle.dump(peopleList, f)
...loaded school data from pickle...
...loaded work list from pickle...
...loaded house list from pickle...
...loaded people list from pickle...

Inspect the age distribution of the human agents.

In [18]:
# make a histogram for age distributions of human agents
age = [peopleList[i].age for i in range(0,len(peopleList))] # make a list for people's age
plt.figure(figsize=(10,10))
plt.hist(age, bins=20, color='orange', edgecolor='black', linewidth=1.2)
plt.xlabel("Age", size=15)
plt.ylabel("Number of people", size=15)
print("Total population:", len(peopleList))
Total population: 33648

Visualizing the Contact Network

We are working with a very large simulated data set that networkx cannot visualize in full, but it can visualize subsets (or rather subgraphs). We provide two ways to subset the graph: randomly subsampling by person ID, and taking all people within a geographic subset of the study area.

To ease the computational strain on networkx, we use a line graph rather than the original graph. The line graph of an undirected graph G is a graph that represents the adjacencies between the edges of G. In our case, this means that instead of the nodes being people and the edges being the places they visit, the nodes are places and the edges are the people who visit both places. Because the model ties each person to a house and to a school or workplace, this representation reduces the graph while preserving all connectivity information.
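A small, hedged illustration of this place-node / person-edge representation is below; the people and places are made up, and the real graphs are built from the synthesized population.

import itertools
import networkx as nx

# Each person ties together the places they visit (their house plus a school
# or workplace), so places become nodes and people become labeled edges.
visits = {
    'child_1': ['house_1', 'school_A'],
    'adult_1': ['house_1', 'work_B'],
    'adult_2': ['house_2', 'work_B'],
}
G = nx.Graph()
for person, places in visits.items():
    for a, b in itertools.combinations(places, 2):
        G.add_edge(a, b, person=person)

print(G.nodes())           # the places
print(G.edges(data=True))  # each edge records the person connecting two places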

This is the line graph for Queen Anne by randomly selecting every 10th person:

Graph here

This is the line graph for Queen Anne obtained by taking a geographic subset (people within the bounding box x in (547,956 - 548,783) and y in (5,277,082 - 5,277,186)):

Graph here

Initial settings of the ABM

  1. introRate: controls how many people are randomly seeded as exposed at the beginning of the simulation (the slider value is divided by 10,000 in the code below).

  2. reproduction: the expected number of secondary infections generated by a single infectious case.

  3. infRate: the probability that an individual becomes infected when he or she comes into contact with an infectious person.

  4. days: the number of days the simulation is run.

Attention: this model is set to run for 30 days here, but for a full flu season the days parameter should be set to 155.

In [19]:
import ipywidgets as widgets
introRateW = widgets.IntSlider(min=0,max=10,value=5, description="introRate")
reproW = widgets.FloatSlider(min=0, max=5, step=0.1,value=2.4, description="reproduction")
infW = widgets.FloatSlider(min=0, max=1, step=.01, value=0.18, description="infRate")
daysW = widgets.IntSlider(min=0, max=155, value = 30, description="days")
display(introRateW, reproW, infW, daysW)
In [20]:
introRate = float(introRateW.value)/10000.0
reproduction = reproW.value
infRate = infW.value
days = daysW.value # number of days to simulate

#initial infections based on the introRate
peopleList = Env.initialInfection(peopleList, introRate)

Processes of the ABM

  1. A for loop is used to iterate over the simulation period (days).
  2. During each day, susceptible humans have a chance of being exposed to influenza by infectious humans.
  3. Exposed humans are assumed to become infectious after an incubation period (3 days).
  4. Infected humans are assumed to recover after an infectious period (7 days); a sketch of this bookkeeping follows the list.
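
The incubating() and recovering() methods live in Human.py and are not shown in this notebook; the sketch below shows what such SEIR bookkeeping might look like, assuming hypothetical per-agent day counters alongside the S/E/I/R flags used in the loop that follows.

# Hypothetical sketch (not the actual Human.py code).
INCUBATION_DAYS = 3
INFECTIOUS_DAYS = 7

def incubating_sketch(person):
    # an exposed agent becomes infectious after the incubation period
    if person.E:
        person.daysExposed += 1
        if person.daysExposed >= INCUBATION_DAYS:
            person.E, person.I = False, True

def recovering_sketch(person):
    # an infectious agent recovers after the infectious period
    if person.I:
        person.daysInfectious += 1
        if person.daysInfectious >= INFECTIOUS_DAYS:
            person.I, person.R = False, True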
In [21]:
from tqdm.notebook import tqdm
total_simulation_time=0
infectiousCount=[]
exposedX = []
exposedY = []
exposedT = []

for t in tqdm(range(1,days), desc="Time Steps"):
    start = time.time()
    susceptibleList=[]
    exposedList=[]
    infectiousList=[]
    recoveredList=[]
    for p in tqdm(range(0,len(peopleList)), desc="Update States", leave=False):
        person = peopleList[p]
        person.incubating()
        person.recovering()
       
        if (person.S==True):
            susceptibleList.append(person)
        if (person.E==True):
            exposedList.append(person)
        if (person.I==True):
            infectiousList.append(person)
        if (person.R==True):
            recoveredList.append(person)
            if (person.recT is None):
                person.recT = t;
        if (person.infT==t):
            exposedX.append(person.infX)
            exposedY.append(person.infY)
            exposedT.append(person.infT)
            
    infectiousCount.append(len(infectiousList))
    
    for p in tqdm(peopleList, desc="Infection", leave=False):
        p.infecting(peopleList, infRate, reproduction, t) #infecting function is included in Human.py
        
    iteration_time = time.time()-start
    total_simulation_time+=iteration_time
    print("Simulation Day: {:3d},  Remaining Days: {:3d},  # of infectious people: {:5d}  took  {:4.2f} seconds".format(t, days-t,len(exposedList),iteration_time))
    

total_simulation_time/=60.0 # get minutes
hours = False
daysBool = False
if total_simulation_time >= 60:
    total_simulation_time/=60.0 # get hours
    hours = True
    if total_simulation_time >= 24:
        total_simulation_time/=24.0
        daysBool = True
        
if not hours:
    print("\n\n Total Simulation Time: {:.1f} minutes.".format(total_simulation_time))
elif not daysBool:
    print("\n\n Total Simulation Time: {:.1f} hours.".format(total_simulation_time))
else:
    print("\n\n Total Simulation Time: {:.1f} days.".format(total_simulation_time))
Simulation Day:   1,  Remaining Days:  29,  # of infectious people:     8  took  0.24 seconds
Simulation Day:   2,  Remaining Days:  28,  # of infectious people:     9  took  0.26 seconds
Simulation Day:   3,  Remaining Days:  27,  # of infectious people:     4  took  0.27 seconds
Simulation Day:   4,  Remaining Days:  26,  # of infectious people:     6  took  0.34 seconds
Simulation Day:   5,  Remaining Days:  25,  # of infectious people:    11  took  0.27 seconds
Simulation Day:   6,  Remaining Days:  24,  # of infectious people:    10  took  0.31 seconds
Simulation Day:   7,  Remaining Days:  23,  # of infectious people:     6  took  0.37 seconds
Simulation Day:   8,  Remaining Days:  22,  # of infectious people:     8  took  0.27 seconds
Simulation Day:   9,  Remaining Days:  21,  # of infectious people:     7  took  0.36 seconds
Simulation Day:  10,  Remaining Days:  20,  # of infectious people:     8  took  0.39 seconds
Simulation Day:  11,  Remaining Days:  19,  # of infectious people:    10  took  0.41 seconds
Simulation Day:  12,  Remaining Days:  18,  # of infectious people:    17  took  0.41 seconds
Simulation Day:  13,  Remaining Days:  17,  # of infectious people:    18  took  0.45 seconds
Simulation Day:  14,  Remaining Days:  16,  # of infectious people:    20  took  0.36 seconds
Simulation Day:  15,  Remaining Days:  15,  # of infectious people:    15  took  0.45 seconds
Simulation Day:  16,  Remaining Days:  14,  # of infectious people:    27  took  0.34 seconds
Simulation Day:  17,  Remaining Days:  13,  # of infectious people:    24  took  0.49 seconds
Simulation Day:  18,  Remaining Days:  12,  # of infectious people:    24  took  0.54 seconds
Simulation Day:  19,  Remaining Days:  11,  # of infectious people:    26  took  0.65 seconds
Simulation Day:  20,  Remaining Days:  10,  # of infectious people:    32  took  0.48 seconds
Simulation Day:  21,  Remaining Days:   9,  # of infectious people:    28  took  0.69 seconds
Simulation Day:  22,  Remaining Days:   8,  # of infectious people:    33  took  0.90 seconds
Simulation Day:  23,  Remaining Days:   7,  # of infectious people:    39  took  0.90 seconds
Simulation Day:  24,  Remaining Days:   6,  # of infectious people:    44  took  0.89 seconds
Simulation Day:  25,  Remaining Days:   5,  # of infectious people:    57  took  0.88 seconds
Simulation Day:  26,  Remaining Days:   4,  # of infectious people:    64  took  0.92 seconds
Simulation Day:  27,  Remaining Days:   3,  # of infectious people:    76  took  0.87 seconds
Simulation Day:  28,  Remaining Days:   2,  # of infectious people:    75  took  1.01 seconds
Simulation Day:  29,  Remaining Days:   1,  # of infectious people:    74  took  1.22 seconds



 Total Simulation Time: 0.3 minutes.

The simulation results are saved to a .csv file so that stochasticity can be examined later. Each time the simulation is run, every cell from "Synthesize a Population" onward must be rerun.

Visualization

In [22]:
## Visualization of the simulated flu cases
realdata = pd.read_csv('./data/{}/fludata.csv'.format(CITY))
plt.figure(figsize=(14,7))
plt.plot(realdata['Date'], realdata['Positive'])
infCases=pd.DataFrame({'time':range(1,days),'infectious':infectiousCount})
plt.figure(figsize=(14,7))
plt.plot(infCases['time'],infCases['infectious'])
plt.title('The number of infectious people over day of year', size=20)
plt.xlabel("Days", size=15)
plt.ylabel("Number of people", size=15)
Out[22]:
Text(0, 0.5, 'Number of people')
In [23]:
## Visualization of actual flu data from the Queen Anne neighborhood in Seattle
realdata = pd.read_csv('./data/{}/fludata.csv'.format(CITY))
plt.figure(figsize=(14,7))
plt.plot(realdata['Date'], realdata['Positive'])
plt.title('The number of infectious people over day of year', size=20)
plt.xlabel("Date", size=15)
plt.ylabel("Number of infectious people", size=15)
import matplotlib.ticker as ticker
ax = plt.axes()
ax.xaxis.set_major_locator(ticker.MultipleLocator(5))
/opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:9: MatplotlibDeprecationWarning: Adding an axes using the same arguments as a previous axes currently reuses the earlier instance.  In a future version, a new instance will always be created and returned.  Meanwhile, this warning can be suppressed, and the future behavior ensured, by passing a unique label to each axes instance.
  if __name__ == '__main__':
In [24]:
# Make the list of information of influenza outbreaks.
houseX=[peopleList[i].houseX for i in range(len(peopleList)) if peopleList[i].infX != None]
houseY=[peopleList[i].houseY for i in range(len(peopleList)) if peopleList[i].infY != None]

exposedX=[peopleList[i].infX for i in range(len(peopleList)) if peopleList[i].infX != None]
exposedY=[peopleList[i].infY for i in range(len(peopleList)) if peopleList[i].infY != None]
exposedT=[peopleList[i].infT for i in range(len(peopleList)) if peopleList[i].infX != None]

recoveredX=[peopleList[i].houseX for i in range(len(peopleList)) if peopleList[i].recT != None]
recoveredY=[peopleList[i].houseY for i in range(len(peopleList)) if peopleList[i].recT != None]
recoveredT=[peopleList[i].recT for i in range(len(peopleList)) if peopleList[i].recT != None]

The output file will be saved as results.csv.

In [25]:
results = pd.DataFrame({'time':exposedT, 'xCoord':exposedX, 'yCoord':exposedY})
results.to_csv('data/{}/results.csv'.format(CITY))
In [26]:
results = pd.read_csv('data/{}/results.csv'.format(CITY))

Make a video of infection and recovery cases.

In [27]:
xmin = 546302;
xmax = 549610;
ymin = 5274394;
ymax = 5277978;

for t in range(days):
    fig = plt.figure() # size of figure
    
    ilistX = [houseX[i] for i in range(len(exposedX)) if exposedT[i] <= t]
    ilistY = [houseY[i] for i in range(len(exposedY)) if exposedT[i] <= t]
    rlistX = [recoveredX[i] for i in range(len(recoveredX)) if recoveredT[i] <= t]
    rlistY = [recoveredY[i] for i in range(len(recoveredY)) if recoveredT[i] <= t]
    plt.scatter(houses['X'],houses['Y'], s = 0.2, c = '0.8') # drawing scatter plots of households' locations
    plt.scatter(ilistX,ilistY, s = 10, c = 'r', label ='Infectious cases')
    plt.scatter(rlistX,rlistY, s = 10, c = 'b', label ='Recovered cases')
    plt.xlabel("UTM X-Coordinates", size=15)
    plt.ylabel("UTM Y-Coordinates", size=15)
    plt.xlim(xmin,xmax)
    plt.ylim(ymin,ymax)
    plt.title('Time = '+str(t)+' days, I(t) = '+str(len(ilistX)-len(rlistX)), size = 20)
    plt.axes().set_aspect('equal', 'datalim')
    plt.legend(markerscale = 2.0)
    plt.tight_layout()    
    
    fig.savefig('img/Figures/figure%d.png' % t, dpi=300)
    pylab.close(fig);
    
/opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:21: MatplotlibDeprecationWarning: Adding an axes using the same arguments as a previous axes currently reuses the earlier instance.  In a future version, a new instance will always be created and returned.  Meanwhile, this warning can be suppressed, and the future behavior ensured, by passing a unique label to each axes instance.
In [28]:
frames = []
    
fig = plt.figure(figsize=(15,15))
plt.xticks([])
plt.yticks([])
plt.tight_layout()    
for t in range(days):
    img = mpimg.imread('img/Figures/figure%d.png' % t) 
    frames.append([plt.imshow(img, animated=True)])

pylab.close(fig);
ani = animation.ArtistAnimation(fig, frames, interval=500, blit=False)
ani.save('img/movie.mp4')
In [63]:
import io
import base64
from IPython.display import HTML

video = io.open("img/movie.mp4", 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''<video width="960p" height="720p" alt="test" controls>
                <source src="data:video/mp4;base64,{0}" type="video/mp4" />
             </video>'''.format(encoded.decode('ascii')))
Out[63]:
In [110]:
import geopandas as gpd
county_geo = gpd.read_file('./data/shapefiles/qanne.geojson')
county_geo_data = county_geo.to_crs(epsg='4326').to_json()
result_data = pd.read_csv("./data/QueenAnne/results.csv")

myProj = Proj("+proj=utm +zone=10N, +north +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
lonlat = myProj(result_data['xCoord'].values.tolist(), result_data['yCoord'].values.tolist(), inverse=True)
result_data['lat'] = lonlat[1]
result_data['lon'] = lonlat[0]
from folium.plugins import HeatMap
heat_data = [[row['lat'],row['lon']] for index, row in result_data.iterrows()]

heat_data_2 = [[[row['lat'],row['lon']] for index, 
              row in result_data[result_data['time'] == i].iterrows()] for i in range(0,155)]
In [114]:
from folium import plugins
m = folium.Map(location=[47.648972, -122.361579], tiles = 'cartodbpositron', zoom_start=13)


heatmap = plugins.HeatMapWithTime(heat_data_2,auto_play=True,max_opacity=1,
                                  radius = 20,
                                  name='Influenza Cases Time Series', 
                                  gradient={.6: 'PINK', .95: 'RED', 1: 'CRIMSON'})

heatmap.add_to(m)
m.save('./maps/HeatMap.html')

from IPython.display import IFrame

IFrame(src='./maps/HeatMap.html', width=900, height=1000)

5. Contributions to Reproducibility and Replicability

  • Replicability

    • Adapted the original model using data obtained for a different city with a different population structure (different datasets and parameters)
    • Improved the model through comprehensive documentation, which makes it more reproducible and replicable for future work
  • Reproducibility

    • Made the code more easily adaptable to other cities or areas through intuitive parameter sliders and straightforward data replacement
    • Clearly documented the access, synthesis, analysis, and visualization of open data sources

6. Future Work

  1. How can the contact networks be better represented?
    • Infection rate as a function of age group (specify more vulnerable age groups) and/or immunity status (based on vaccine coverage data or based on age-specific vaccine coverage)
    • Different representations of connections (e.g., a hypergraph, in which any set of nodes can be connected, to better represent households)
    • Better represent real-world scenarios: certain age groups attending different types of schools, work-from-home populations, college and university students who live and work on campus, and additional points of contact such as daycares, nursing homes, and public spaces
    • How do socioeconomic factors such as race/ethnicity or income affect the susceptibility, infectability, and recovery rates of the population?
    • The model currently assumes that agents move from their household to the nearest workplace or school; this could be addressed using the geography of the city, public transportation, and gravity models
    • The model also assumes a fixed population (no one enters or leaves, and there are no births or deaths)
  2. Improve environment parameters of model
    • In addition to human agents, model schools/workplaces/households as agents with their own attributes and methods to respond to some condition (e.g., school closures due to influenza outbreak and how that impacts the results)
    • Change the temporal specification of the model (e.g., run in 8-hour increments rather than one-day steps; after every 15 units, assume 6 units of leisure, distinguishing workdays from weekends)
    • How can the natural environment be factored into the model? (e.g., atmospheric predictors such as absolute temperature and humidity)
    • Are there marginal effects? (i.e., are people living in the periphery infecting other people outside the research area?)
    • How does variation in household size impact the results? What about family and non-family household classification?
  3. Do results of the simulation correspond to observation reports?
    • Results of the model supported by observed influenza data
    • Various ways to validate model that could be explored
    • Sensitivity analysis 1: mitigate mismatch between model outcomes and observations through variance-based global sensitivity analysis (GSA) to detect appropriate model specification and parameterization (Kang and Aldstadt, 2019)
    • Sensitivity analysis 2: percentage change of infections (e.g., single parameter or multiple parameters, based on deviation and coefficients) (Guo et al., 2015)
    • Compare total number of infection cases to empirical data
    • Additional validation techniques like uncertainty analysis
    • Examine reproducibility of ABM by using statistical analyses to compare results
  4. How can this model be adapted to a broader study context?
    • Empirical input data: list of households, schools, workplaces
    • Can the algorithm be applied to simulate the disease spread in other eras?
    • Can the algorithm be applied to simulate a city with a similar scale?
    • Can the algorithm be applied to a more rural area (less densely populated)?
    • Potential to look at how the model could be adapted for other diseases (changing incubation period, mode of transmission, etc.)


7. References

Florida Health. (2019). Florida Influenza Surveillance Reports. Accessed July 2019. http://www.floridahealth.gov/diseases-and-conditions/influenza/florida-influenza-surveillance-report-archive/index.html.

Guo, D., Li, K. C., Peters, T. R., Snively, B. M., Poehling, K. A., & Zhou, X. (2015). Multi-scale modeling for the transmission of influenza and the evaluation of interventions toward it. Scientific reports, 5, 8980. Accessed July 2019. https://www.nature.com/articles/srep08980.

Helmholtz. (2016). The ODD Protocol for describing individual-based and agent-based models. Accessed July 2019. https://www.ufz.de/export/data/2/100067_ODD_Update_template.doc.

Kang, J. Y., & Aldstadt, J. (2019). Using multiple scale space-time patterns in variance-based global sensitivity analysis for spatially explicit agent-based models. Computers, Environment and Urban Systems, 75, 170-183.

Lowen, A. C., Mubareka, S., Steel, J., & Palese, P. (2007). Influenza virus transmission is dependent on relative humidity and temperature. PLoS pathogens, 3(10), e151.

Mao, L., & Bian, L. (2010). Spatial–temporal transmission of influenza and its health risks in an urbanized area. Computers, Environment and Urban Systems, 34(3), 204-215.

QUEBS. (2013). The ODD protocol for documenting agent-based models. Accessed July 2019. https://qubeshub.org/community/groups/odd.

Seattle & King County Public Health. (2019). King County Influenza Update: May 4, 2019. Accessed July 2019. https://www.kingcounty.gov/depts/health/communicable-diseases/disease-control/~/media/depts/health/communicable-diseases/documents/influenza/2019/week-18.ashx.

Tracy, M., Cerda, M., & Keyes, K. M. (2018). Agent-based modeling in public health: Current applications and future directions. Annual Review of Public Health, 39, 77-94. Accessed July 2019. https://doi.org/10.1146/annurev-publhealth040617-014317.

Willem, L. (2015). Agent-based models for infectious disease transmission: Exploration, estimation & computational efficiency. PhD dissertation, University of Antwerp. Accessed July 2019. https://ibiostat.be/publications/phd/landerwillem.pdf.
