Compositional epidemiological modeling

# Compositional epidemiological modeling <br/> using structured cospans

]

## Micah Halter and Evan Patterson <br/> with James Fairbanks

]

### November 16, 2020

---

# Compositional scientific modeling

* Applied category theory promises to reduce complexity and boost engineering efficiency through *compositionality*
  * Contemporary scientific models can be extremely large and complex, but they are usually built out of simpler pieces
  * **Our goal**: Build ACT software that makes it easier to compose scientific models

---

# Outline

1. C-sets in Catlab
2. Structured cospans of C-sets
3. Structured multicospans and undirected wiring diagrams
4. Petri nets as C-sets
5. COEXIST model of COVID-19

**For background**, watch last week's talk by John Baez on the mathematics of structured cospans and whole-grain Petri nets.

---

# C-sets

**Recall**: Given a small category $\mathcal{C}$, a *$\mathcal{C}$-set* is a functor $\mathcal{C} \to \mathbf{Set}$. <br/> A $\mathcal{C}$-set is also called a "copresheaf on $\mathcal{C}$."

**Example**: The *schema for graphs* is the category $\mathcal{C}$ generated by

![Schema for graphs](schema-graphs.svg# w-20pct)

A $\mathcal{C}$-set is then a *graph*, aka a "directed multigraph."

---

# C-sets in Catlab

In Catlab, a $\mathcal{C}$-set is a functor $\mathcal{C} \to \textbf{FinSet}$.

* Finite sets are identified with $\\{1,\dots,n\\}$ (we work in the skeleton of $\mathbf{FinSet}$)
  * Functions are vectors of integers
  * Preimages ("indexes") are stored for fast lookups, e.g. of edges incident to a vertex

From a software perspective, $\mathcal{C}$-sets are like in-memory databases.

```julia
using Catlab, Catlab.CategoricalAlgebra

@present SchemaGraphs(FreeSchema) begin
  V::Ob
  E::Ob
  src::Hom(E,V)
  tgt::Hom(E,V)
end

const Graph = CSetType(SchemaGraphs,
                       index=[:src, :tgt])

g = Graph()
add_parts!(g, :V, 3)
add_parts!(g, :E, 2, src=[1,2], tgt=[2,3])
g
```

<div class="c-set">
<span class="c-set-summary">CSet with elements V = 1:3, E = 1:2</span>
<table>
  <tr class = "header headerLastRow">
    <th class = rowNumber>E</th>
    <th style = "text-align: right; ">src</th>
    <th style = "text-align: right; ">tgt</th>
  </tr>
  <tr>
    <td class = rowNumber>1</td>
    <td style = "text-align: right; ">1</td>
    <td style = "text-align: right; ">2</td>
  </tr>
  <tr>
    <td class = rowNumber>2</td>
    <td style = "text-align: right; ">2</td>
    <td style = "text-align: right; ">3</td>
  </tr>
</table>
</div>

---

# Morphisms of C-sets

**Recall**: A *morphism of $\mathcal{C}$-sets* $X \to Y$ is a natural transformation $\phi: X \to Y$: a family of functions $\phi_c: X(c) \to Y(c)$, $c \in \mathcal{C}$, satisfying the naturality equations.

Categories of $\mathcal{C}$-sets are very nice: they are "combinatorial toposes."

Implemented in Catlab:

* Morphisms of $\mathcal{C}$-sets
  * (Co)limits of fixed shape: (co)products, pullbacks, pushouts
  * (Co)limits of arbitrary shape

---

# Attributed C-sets

What about data outside of `FinSet{Int}`, say in $\mathbb{R}$?

* Edge weights in graphs
  * Rates in Petri nets

For this, Catlab supports *attributed $\mathcal{C}$-sets*, which can be formalized in several ways:

* As functors $\mathcal{C} \to \mathbf{Set}$, like before, but fixed at certain objects
  * As transformations out of a profunctor $\mathrm{Attr}: \mathcal{C} \to \mathsf{Data}$ (cf. [Schultz et al 2017](http://www.tac.mta.ca/tac/volumes/32/16/32-16abs.html) on "algebraic databases")

```julia
@present SchemaWeightedGraphs <: SchemaGraphs begin
  Real::Data
  weight::Attr(E,Real)
end

const WeightedGraph = ACSetType(
  SchemaWeightedGraphs, index=[:src, :tgt])

g = WeightedGraph{Float64}()
add_parts!(g, :V, 2)
add_parts!(g, :E, 3, src=1, tgt=2,
           weight=[0.25, 0.5, 0.75])
g
```

<div class="c-set">
<span class="c-set-summary">ACSet with elements V = 1:2, E = 1:3</span>
<table>
  <tr class = "header headerLastRow">
    <th class = rowNumber>E</th>
    <th style = "text-align: right; ">src</th>
    <th style = "text-align: right; ">tgt</th>
    <th style = "text-align: right; ">weight</th>
  </tr>
  <tr>
    <td class = rowNumber>1</td>
    <td style = "text-align: right; ">1</td>
    <td style = "text-align: right; ">2</td>
    <td style = "text-align: right; ">0.25</td>
  </tr>
  <tr>
    <td class = rowNumber>2</td>
    <td style = "text-align: right; ">1</td>
    <td style = "text-align: right; ">2</td>
    <td style = "text-align: right; ">0.5</td>
  </tr>
  <tr>
    <td class = rowNumber>3</td>
    <td style = "text-align: right; ">1</td>
    <td style = "text-align: right; ">2</td>
    <td style = "text-align: right; ">0.75</td>
  </tr>
</table>
</div>

---

# Morphisms of attributed C-sets

Usually prefer a restricted notion of *morphism of attributed $\mathcal{C}$-set*: natural transformations $\phi: X \to Y$ such that

$$
\phi_d = 1_D \qquad\text{where}\qquad D := X(d) = Y(d)
$$

for all data objects $d \in \mathcal{C}$.

Similarly to slice categories:

* colimits are the same
  * limits are different but still exist

---

# Structured cospans

Structured cospans model *open systems* ([Baez & Courser 2020](https://arxiv.org/abs/1911.04630)).

**Definition**: Given a functor $L: \mathcal{A} \to \mathcal{X}$, a *structured cospan* is a cospan in $\mathcal{X}$ of form

![Structured cospan](structured-cospan.svg# w-40pct)

where $a,b \in \mathcal{A}$ and $x \in \mathcal{X}$.

---

# Two forms for structured cospans

The functor $L: \mathcal{A} \to \mathcal{X}$ usually has a right adjoint $R: \mathcal{X} \to \mathcal{A}$.

**"$L$-form"**: objects $a,b \in \mathcal{A}$ and cospan in $\mathcal{X}$ of form

![Structured cospan in L-form](structured-cospan-l-form.svg# w-60pct)

**"$R$-form"** similar to *decorated cospan* ([Fong 2015](http://www.tac.mta.ca/tac/volumes/30/33/30-33abs.html)): object $x \in \mathcal{X}$ and cospan in $\mathcal{A}$ of form

![Structured cospan in R-form](structured-cospan-r-form.svg# w-60pct)

---

# Structured cospans of C-sets

**Construction**: Take the *discrete $\mathcal{C}$-set functor*

$$
L: \mathcal{A} \to \mathcal{X}, \qquad
L(A)(c) := \begin{cases}
  A &\text{if $c = c_0$} \\\
  \emptyset &\text{otherwise}
\end{cases}
$$

where

* $\mathcal{A} = \mathbf{FinSet}$
  * $\mathcal{X} = \mathcal{C}\text{-}\mathbf{FinSet}$ for some small category  $\mathcal{C}$
  * $c_0$ is an object in $\mathcal{C}$ with no outgoing morphisms.

Right adjoint $R: \mathcal{X} \to \mathcal{A}$ is the *forgetful functor* $X \mapsto X(c_0)$.

---

# Open C-sets

An *open $\mathcal{C}$-set* is a structured cospan for such a functor $L$.

**Example**: In *open graphs* take $c_0 = V$, so that the functor $L: \mathbf{FinSet} \to \mathbf{Graph}$ is

$$
\begin{aligned}
  L(A)(V) &= A \\\
  L(A)(E) &= \emptyset.
\end{aligned}
$$

In code, we usually specify open $\mathcal{C}$-sets in $R$-form.

```julia
const OpenGraphOb, OpenGraph =
  OpenCSetTypes(Graph, :V)

x = Graph(4)
add_edges!(x, [1,1,2,2], [2,2,3,4])

M = OpenGraph(x, FinFunction([1], 4),
              FinFunction([3,4], 4))

gv = show_graphviz(apex(M))
```

![Open graph](open-graph.svg)

```julia
Tuple(feet(M))
```

```
(FinSet(1), FinSet(2))
```

---

# Open C-sets as a hypergraph category

The usual operations are supported:

* composition, via pushout
  * monoidal product, via coproduct
  * identity, braiding, ...

```julia
y = Graph(4)
add_edges!(y, [1,2,3], [3,3,4])

N = OpenGraph(y, FinFunction([1,2], 4),
              FinFunction([4], 4))
```

![Another open graph](open-graph-2.svg)

<br>

```julia
using Catlab.Theories

P = M⋅N
```

![Composite of open graphs](open-graph-compose.svg)

```julia
Q = M⊗N
```

![Product of open graphs](open-graph-otimes.svg)

---

# Biased vs unbiased notions of composition

Decorated/structured cospans are usually viewed as hypergraph categories or symmetric monoidal double categories.

* These structures are **biased**: defined by a finite set of primitive operations of fixed arity ($\circ$ and $\otimes$ are binary operations)
  * They are also **directed**: inputs are distinguished from outputs

While mathematically convenient, both cause practical difficulties:

* Compositions must be decomposed into primitives, either

* by hand, which is tedious and error prone, or
      * algorithmically, which is complicated and slow
  * Directionality is artificial: must deal with cups and caps

---

# The operad of undirected wiring diagrams

*Operads* solve the first problem, and the *operad of undirected wiring diagrams* ([Spivak 2013](https://arxiv.org/abs/1305.0297)) solves both problems.

![Example UWD](example-uwd.svg# center)

<small>(Outer ports of UWDs are drawn as dangling wires.)</small>

---

# The operad of undirected wiring diagrams

A UWD represents a general composition in a hypergraph category.

UWDs are the natural unbiased syntax for:

* tensor networks
  * conjunctive queries of relational DBs
  * formulas in regular logic ([Fong & Spivak 2018](https://arxiv.org/abs/1812.05765))
  * composites of multispans or multicospans
  * composites of structured multicospans

---

# Undirected wiring diagrams as C-sets

**Definition**: An *(untyped) UWD* is a $\mathcal{C}$-set for the category $\mathcal{C}$ generated by

![Schema for UWDs](schema-uwds.svg# w-50pct)

A *typed UWD* is a UWD with compatible typings of ports and junctions.

```julia
@present SchemaUWDs(FreeSchema) begin
  Box::Ob
  Port::Ob
  OuterPort::Ob
  Junction::Ob

box::Hom(Port, Box)
  junction::Hom(Port, Junction)
  outer_junction::Hom(OuterPort, Junction)
end

@present SchemaTypedUWDs <: SchemaUWDs begin
  Type::Data

port_type::Attr(Port, Type)
  outer_port_type::Attr(OuterPort, Type)
  junction_type::Attr(Junction, Type)

junction ⋅ junction_type == port_type
  outer_junction ⋅ junction_type ==
    outer_port_type
end
```

---

# Structured multicospans

**Definition**: Given a functor $L: \mathcal{A} \to \mathcal{X}$, a *structured multicospan* is a diagram in $\mathcal{X}$ of form

![Structured multicospan](structured-multicospan.svg# w-80pct)

where $a_1,\dots,a_n \in \mathcal{A}$ and $x \in \mathcal{X}$.

Structured multicospans (or their isomorphism classes) are an *algebra* of the operad of $\mathcal{A}$-typed UWDs.

---

# Composing structured multicospans

Any undirected wiring diagram typed by objects in $\mathcal{A}$...

![Composition as UWD](composition-as-uwd.svg# w-50pct)

...specifies a diagram in $\mathcal{X}$ over which to take a colimit.

![Composition as colimit](composition-as-colimit.svg# w-80pct)

---

# Structured multicospans in Catlab

Define UWDs using the `@relation` macro:

```julia
uwd = @relation (a,d) where (a,b,c,d) begin
  X(a,b)
  Y(b,c)
  Z(c,d)
end

gv = show_graphviz(uwd)
```

![UWD for ternary composition](ternary-compose-uwd.svg)

Compose according to UWD-algebra using `oapply` function:

```julia
P = oapply(uwd, [N, M, N])

graph = apex(P)
gv = show_graphviz(graph, node_labels=false)
```

![Ternary composite graph](ternary-compose-graph.svg)

---

# Petri nets as attributed C-sets

Schema for Petri nets as C-sets ([Kock 2020](https://arxiv.org/abs/2005.05108)) with attributes for labels, rate constants, and initial concentrations:

![Petri net with rates schema](rxnet-cset.svg# w-75pct)

```julia
@present TheoryReactionNet(FreeSchema) begin
  T::Ob
  S::Ob
  I::Ob
  O::Ob

Name::Data
  Rate::Data
  Concentration::Data

it::Hom(I,T)
  is::Hom(I,S)
  ot::Hom(O,T)
  os::Hom(O,S)

tname::Attr(T, Name)
  sname::Attr(S, Name)
  rate::Attr(T, Rate)
  concentration::Attr(S, Concentration)
end
```

???

Extend the definition of whole-grain petri nets

Attributes for labels on both states and transitions

Attributes for initial concentrations on states (natural numbers)

Attributes for rates on transitions (positive real numbers)

We refer to these as reaction networks

In Catlab, the morphisms of C-set Petri nets are natural transformations but in the colimits we use for structured cospans, all the morphisms involved are actually etale maps, as described by Kock.

---

# Epidemiology primitive reactions

All basic epidemiological models can be constructed from two basic primitives

```julia
const LabelledRxNet = LabelledReactionNet{Number,Int}

spontaneous = LabelledRxNet((:I=>10,:R=>0), (:rec=>.25,:I=>:R))
```

![Spontaneous Petri net](spontaneous_petri.svg# center)

```julia
transmission = LabelledRxNet((:S=>100, :I=>10, :E=>0), (:exp=>.9, (:S,:I)=>(:E,:I)))
```

![Transmission Petri net](transmission_petri.svg# center)

???

Basic epidemiology models can be build from two basic primitives

A spontaneous change in a state (i.e. recovery, disease progression, etc.)

A state causing another to change (i.e. infection, exposure, etc.)

* In the case of infection the first state is changed to become the infection state that is causing it to change

Explain the notation/API for defining these blocks

Define these primitives using helper functions that make it easy to make different versions

---

# Epidemiology theory

```julia
@present Epi(FreeHypergraphCategory) begin
    (S,E,I,R,D)::Ob

infection::Hom(S⊗I, I)
    exposure::Hom(S⊗I, E)
    illness::Hom(E,I)
    recovery::Hom(I,R)
    death::Hom(I,D)
end

sir = Epi[:infection] ⋅ Epi[:recovery]
```

<br> ![SIR model directed wiring diagram](sir_wd.svg# ph-5) ![SIR model Petri net](sir_petri.svg)

???

Explain the theory and the available Homs

Describe how each hom is an example of one of the two primitive reactions

describe the sir model created with composition and the resulting wd/petri net

describe how the fuctor is defined as a dictionary of what primitive cospan you want to want to replace each hom with

---

# Directed vs undirected composition

```julia
seir = @program Epi (s::S, i::I) begin
  e = exposure(s,i)
  i2 = illness(e)
  r = recovery([i,i2])
  return r
end
```

![SEIR model directed wiring diagram](seir_wd.svg# pv-5) <br> <br> <br> <br>

```julia
seir = @relation (s,e,i,r) begin
    exposure(s,i,e)
    illness(e,i)
    recovery(i,r)
end
```

![SEIR model undirected wiring diagram](seir_uwd.svg# w-45pct)

???

Describe SEIR model, exposure phase before illness

Two methods of constructing these models

* Either as directed wiring diagrams using structured cospans and the `@program` macro
  * Or undirected using the `@relation` macro and resource sharing

In the undirected wiring diagram instead of cospans we have multispans where each leg points to one state

In the undirected wiring diagram, you don't have to worry about merging and copying and in the case of petri nets it is easier to make models because they operate on resource sharing

Explain difference in SEIR model definition, keeping track of 2 `I` wires and merging them

---

# Hierarchically defined models

```julia
sir = @relation (s, i, r) begin
    infection(s, i)
    recovery(i, r)
end
```

![SIR model undirected wiring diagram](sir_uwd.svg# pl-4 pr-5 dib vam) ![SIR model Petri net](sir_petri2.svg# dib vam)

???

Next we are looking at using this operadic approach to define models hierarchically

Reintroduce SIR model defined as an undirected wiring diagram

Notice here we are using regular whole-grain petri nets, so no labels, concentrations, or rates