How Do You Simulate a Pandemic?

A conversation with the creators of ‘People of the Pandemic,’ Shirley Wu and Stephen Osserman

Madison Hall
Nightingale
25 min read · May 15, 2020

When the last pandemic was 102 years ago, how would you explain to people that their actions have consequences? Would you make an interactive game? Would it be crass to even call it a game? With something so pressing and all-encompassing in the news, it seemed obvious to build some form of explainer of COVID-19, but this very notion was fraught with the challenge of creating an educational piece on a virus which many leading medical researchers do not fully understand.

These questions plagued Shirley Wu and Stephen Osserman as they created People of the Pandemic, a new pandemic simulator showcasing how the choices you make during a pandemic can affect your entire community.

Wu and Osserman’s simulator places the past 19 users’ choices in conjunction with your own decisions to form a visualization of any ZIP Code’s projected health outcome from COVID-19 alongside data from local hospitals such as the number of available beds and probability of recovery from the virus.

I stumbled upon the simulator while casually scrolling through Twitter, where Wu had posted a link. As both a fan of Wu’s previous work and someone having a difficult time staying up-to-date with new developments regarding COVID-19, I gave an attempt at the game. I tried to be as safe as possible to the point where I promptly ran out of food, ending the game (sadly, no deliverable groceries in this virtual world like my family currently relies on). But after playing multiple times, sometimes failing or putting myself more at-risk than usual, I had a much better understanding of the virus’ effect on the world around me and how truly flattening the curve requires the efforts of more than one individual — it takes a community.

MH: What are each of your backgrounds and how did you develop the skill sets to work on this project?

Shirley Wu (Freelance front-end information designer): For me, my background is that I studied business in college. I didn’t quite like the options I was seeing when I graduated, so I started taking computer science classes and absolutely loved it. And then when I graduated I went to a big data company and I was a front-end engineer there for a while. That’s how I found out about D3. After finding out about D3, I started to get more involved in the community and I learned that there was a cool thing called data visualization. That’s how I found my way there. I guess I’ve been here for eight years — wow. I just really enjoy data visualization just because it’s a combination of all of the things I really, really love. I always loved math, I love to code, and I did art for 14 years before university. So I really enjoy the kind of artistic design side of it too.

Stephen Osserman (Data methodology and spatial statistics): I come at this from somewhat of a circuitous path. I have a background in music and cooking worlds as well as kind of community advocacy work. I ended up in a data science career around 12 or 13 years ago and got really interested in exploratory visualizations. Actually, I like that often the first thing you look at from a data set doesn’t tell you close to all the things within it, and I found exploratory visualization incredibly helpful for mainly making visualizations for myself to develop hypotheses to test. So, that was useful. Then I took a detour into getting a Masters of Spatial Analytics for Public Health. That is a mix of spatial statistics and cartography and GIS. There’s fairly little focus on that, well cartography was a part of that, but a small part of that. But I’ve gotten very interested in, I need to come up with a good term for it, but I guess like analytical cartography or something like that on how to explore data sets through spatial visualizations.

But anyway, so then my coming to this in data visualization has been fairly new for me in the last three or four months. I’ve been learning more of the front-end and D3-type stuff before that. I’ve mainly worked in R and worked with my own models and not on the internet. So yeah, I’ve been loving getting involved and just got involved in DVS this year and it’s been really interesting kind of seeing this world that I wasn’t fully aware of before.

MH: How did you two connect to work on this project?

SO: I had done a few hours of work to create a quick crack at a visualization at a point when it seemed like a lot of people weren’t getting the social distancing concept and flattening the curve much. I posted it, like a very prototype-y D3 simulation thing, in the healthcare channel [in the Data Visualization Society Slack]. Shirley said they were already working on something that was very much overlapping with that and asked if I would be interested in talking. Then it grew and grew and grew from there and beyond.

MH: What were each of your roles in creating and building this project?

SW: Yeah, so I reached out to Stephen before I had got involved. So that was around when I only had the idea of wanting to do something from a spatial social distancing perspective. My background is primarily in front-end — so data visualization for front-end web development, that’s my experience, and from there I had a little bit of data analysis and then primarily design and code. For this particular project I did all of the data visualization, design, and coding. Basically, the design of all of the game page and then all of the data visualizations and then the software infrastructure for making sure that even though the decisions that the players have to make are pretty simple, just about all of those propagate to the rest of the UI on the game page correctly.

And then also kind of project managing all of the communication between everyone. Also the story. So kind of everything involving the actual visualization part as well as a story around the data.

SO: I was working on how to develop a model that could be adjusted based on game players’ decisions in a way that was not incredibly misleading. And yeah, so I guess developing the simulation methodology and then making sure that all of the decision-making inputs that Shirley was developing connected with how that affects the community in the data model. Although as Shirley said, the actual coding of that was Shirley’s part. The first prototype I developed had something similar of people in homes moving to businesses and back on a daily basis. I think both of us kind of settled on a similar concept at the front-end, but Shirley actually turned that into something visually pleasing and useful to engage with for people.

MH: Were there any simulators or games that inspired you along the way for this project?

SO: Not directly from a simulation perspective, I would say. I’d say we started working on this at a point when there were a lot of very simplified explainers starting to be put out in mainstream publications. Actually just a couple of days before a lot of those hit, there was a lot that was methodologically very useful that we leaned on heavily and some good explanations like that that I think were helpful as we were getting further along. But initially, I think it was just kind of something that we both saw a need for.

SW: Yeah. I remember I reached out to Stephen once I saw his simulation just because they looked exactly like how I was imagining it in my head. And I was like, “This is it!” I think that was around the second week of March. And I think that was a few days after the Washington Post had published their explainer on social distancing. I think it was about flattening the curve and the simulation was the balls that were kind of like moving and once they collided, they were infected. I think we met on DVS a few days after that publication.

A screenshot of the Washington Post’s virus simulation

But even before then I had started thinking about the project, like about the idea itself, like three or four days before that publication. So I think that publication was the closest to what we might have had in mind. And actually I think in turn, after seeing that, I was like, “Well, do we still need to do what we’re trying to do?” So that was a very, very helpful explainer. But then I think ultimately we decided to go forward just because I think it’s always helpful to have different angles on the same topic.

SO: I’m not sure if you know this, but the day I went on to post that prototype, I had never posted anything. I got my nerve up to post something and I logged on and the Washington Post thing had come out and I really came very close to just not posting mine. I was like “This is done now. Like, I’m not going to publish.” I’d worked on it like it was the same, it was like the exact same timescale that I think everyone realized that [flattening the curve] needed to happen. However, they had the ability to turn those around in two days. What we were trying to do took over a month, but yeah, it was right exactly at that time.

MH: Can you walk me through the different software, packages and programming skills used to design and create the game?

SW: So from a front-end perspective we used D3. I always use D3 in my projects. For the visualization part, I used Vue for the web app and managing all of the different, “if a user interacts here, then the UI reacts this way.” So I use Vue for that and GreenSock to manage all of the animations throughout the week. Those are the big ones. I always use Lodash, which is a package that’s really nice for data manipulation. We also used Firebase for the back end. But I think those are the major ones for creating the game itself.

MH: And for the data models, what were you using for that?

SO: Mainly D3.

SW: And I guess we also wrote the methodology. We were still working through the anthropology on them.

MH: So was it just mostly a back-and-forth between the two of you? Can you walk through that a little bit?

SW: Yeah, so I think early on another kind of partner that got involved was Egghead.io. They’re an online education platform. But essentially, I was supposed to work with them and then I was like, “Hey sorry, I’m going to be really heads down on this.” And they just said, “Can I help you? Like, can we help you on that?” And so they just full-on offered full production support.

And one of the things that I was really grateful for is that they set us up in Basecamp, which I’ve never used before. That was really amazing in terms of keeping track of the project. Egghead was helping us with the non-data-visualization UI side of it. So a lot of the communication happened there. We tried to keep a lot of the big data decisions and big simulation decisions in Basecamp so that everybody could know what was going on. But for the small things that were just one-on-one between Stephen and me, I think a lot of it was in our DVS Slack messages.

SO: It started out with us independently. Neither of us were interested in working on things that we would have to figure out how to integrate. So we’re working pretty closely together to figure out what the logic of the front-end was going to be so that we would know what the logic of the back-end needed to be. And there was some amount of change to our approaches. I guess the translation from D3 to Vue was the biggest thing. But besides that we were pretty much making sure that we were building things that would be compatible with the final product.

MH: The main interactive part of the simulation is the sort of decision tree of, “I need to go for a walk. I need to go get some groceries, etc.” How did you decide on those different parameters and what was that process like?

SW: So that one is actually a little bit misleading and we’re actually right in the middle of refactoring that and refining that part. From a simulation perspective, the important thing is that we know whether there’s a decision of whether people go out on a given day or they decide to stay in on a given day. So currently as the model is, the important thing is we know how many times they decide to go out per week. And so we’ve gotten feedback that actually the activity part is misleading because we’re not actually using it in our model. So we’re actually refactoring that to be more activity-based. Like, you can choose that you want to go out for a walk three times this week and then you go for groceries once or something like that and then we’ll weigh that appropriately.

Original slider to choose how many times to go outside per week

So I guess the answer is that it’s more than the simulation. The model came first and we tried really hard to figure out how the UI and the story could fit into that model. And then based on feedback we realized that the UI was quite misleading. So now we’re refactoring not just the UI, but the model so that it could go with a decision-making process that is more natural-feeling to people. People have told us that right now the decision making process is really nice because that is how they think about going for a walk or going for groceries or something like that. But then the experts have told us that the slider is misleading. So it is a refining process. (The simulation has since been updated).

Updated slider to choose weekly activities
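The weighted-activity idea Wu describes can be sketched in a few lines. This is a minimal illustration, not the project's code: the activity names, the weight values, and the function names `weekly_exposure` and `trip_count` are all assumptions made up for the example.

```python
# Hypothetical activity weights: illustrative values only, not from the project.
ACTIVITY_WEIGHTS = {"walk": 1, "groceries": 5, "concert": 25}


def trip_count(plan: dict[str, int]) -> int:
    """Original slider: only the number of outings per week matters."""
    return sum(plan.values())


def weekly_exposure(plan: dict[str, int]) -> int:
    """Weighted version: each outing counts by its assumed activity weight."""
    return sum(ACTIVITY_WEIGHTS[name] * times for name, times in plan.items())


# Two weekly plans with the same number of outings but different activities.
cautious = {"walk": 3, "groceries": 1}
social = {"walk": 3, "concert": 1}

print(trip_count(cautious), trip_count(social))
print(weekly_exposure(cautious), weekly_exposure(social))
```

Under a pure trip count the two plans look identical, which is the criticism of the original slider; with weights, the plan that includes a concert carries more exposure.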

SO: Yeah. I would just add to that that it was as Shirley said, the model is just how many times each person in the simulation leaves dictates how much contact there is, which dictates how much virus spread there is. One of the big tasks was how to make this feel relevant and engaging without way over-complicating the model and making it so we just never pushed it out. We kind of knew this was an imperfect solution because we knew that, like Shirley said, those different activities aren’t equivalent, and in the model in the back-end, they were. But we decided that it was better to get something out as long as we were transparent about its limitations in the methodology and elsewhere and give ourselves the opportunity to refine after pushing it out rather than waiting forever to get it out there because we felt like it was already past the first moment that would have been most useful and we were in a second moment that might be useful and we didn’t want to miss that while spending forever refining. So we decided to publish, knowing it wasn’t perfect.
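The chain Osserman describes (outings determine contact, contact determines spread) can be illustrated with a toy calculation. This is only a sketch under stated assumptions: the function `simulate`, the population size, and the `chance_per_trip` value are hypothetical, and the published methodology may work quite differently.

```python
def simulate(trips_per_week: list[int], weeks: int = 8, population: int = 1000,
             initially_infected: int = 1, chance_per_trip: float = 0.5) -> list[float]:
    """Toy expected-value model: more outings mean more chances of infection.

    trips_per_week holds one entry per player; everyone in the community is
    assumed to behave like the average player. All parameters are assumptions.
    """
    avg_trips = sum(trips_per_week) / len(trips_per_week)
    infected = float(initially_infected)
    history = [infected]
    for _ in range(weeks):
        infected_share = infected / population
        # Chance that a susceptible person avoids infection on every outing this week.
        escape = (1 - chance_per_trip * infected_share) ** avg_trips
        infected += (population - infected) * (1 - escape)
        history.append(infected)
    return history


stay_home = simulate([0] * 20)
careful = simulate([1] * 20)
business_as_usual = simulate([7] * 20)
print(stay_home[-1], careful[-1], business_as_usual[-1])
```

Every kind of outing counts the same here, which is exactly the limitation both creators acknowledge in the first release.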

MH: Why did you choose to incorporate the 19 past players’ decisions into the game? Was that an arbitrarily chosen number or is there statistical significance to that number?

SO: It was a nice compromise between what felt like the number of people you can imagine easily and enough people to not feel like it’s a two-player game. But Shirley, I don’t know if you want to mention the inspiration for this.

SW: Yeah, so we went back and forth on this a lot. So I guess let me back up a little bit. The inspiration behind it was actually, are you familiar with The Pudding? They do really great work.

They had a piece called “The Birthday Paradox” where they went and said “In order to explain this paradox, which is about the chances of someone else having the same birthday with you at a party, we’re going to have a virtual party and invite the last N number of people that visited this website.” And I was like, “That’s an ingenious idea. It’s like asynchronous play!” And so when we were talking about how really one of the most important parts of the message we need to tell is that individualistic behavior is not what we want and that it’s community based behavior that is what we need right now.

And so this kind of like inviting the last number of people that visited to participate in your experience seemed like a really good way to bring it together. And like Stephen said, 20 total people is what we settled on after a lot of debate. I think Stephen described it to me really well and said that whether we decide to go out or not, the effect of our decisions is like the voting problem and that us individually doesn’t have that much of an effect on the outcome, but then if we as a whole community decide to go one way or another, that’s huge. We had a lot of long discussions about whether we should stack it in favor of the individual and I think we were trying to decide if the person should have between like 20, 40 to 60, 40 to 50 percent influence or something like that so that they can see the consequence of their individual action or if they should have kind of less influence and it’s a little bit closer to reality.

SW: And then eventually, this was the compromise in that they aren’t representing individuals. They are representing more than just themselves, but it’s not so overwhelming of a number. So no, there’s no statistical significance.
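The "you plus the last 19 players" setup can be sketched as follows. This is an illustration only: how the game actually stores and retrieves earlier players' choices is not described in the interview, so `build_community` and `your_share_of_outings` are hypothetical names and the data is made up.

```python
from collections import deque

COMMUNITY_SIZE = 20  # you plus the previous 19 players, as described in the interview


def build_community(past_choices, your_choice: int) -> list[int]:
    """Combine the 19 most recent stored choices with the current player's choice."""
    recent = deque(past_choices, maxlen=COMMUNITY_SIZE - 1)  # keeps only the last 19
    return list(recent) + [your_choice]


def your_share_of_outings(community: list[int]) -> float:
    """Fraction of the community's weekly outings made by the current player."""
    total = sum(community)
    return community[-1] / total if total else 0.0


# 25 earlier players each went out 3 times a week; only the last 19 are used.
community = build_community([3] * 25, your_choice=3)
print(len(community), your_share_of_outings(community))
```

With everyone behaving alike, one player accounts for a twentieth of the outings, which matches the "voting problem" framing: small individually, decisive collectively.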

SO: As Shirley was saying, one of the biggest challenges was when you’re putting out an explanatory piece that has very limited interaction, it’s very easy to know what lessons people are going to take from it. In this, there’s kind of enough different variables both in terms of what local circumstances are in different places and what decisions they make and what decisions other people make that we wanted to be very careful about making sure that we weren’t putting out messages that would be detrimental to public health. So it’s struggling to figure out the right sets of choices people could make and the right game environment overall so that we wouldn’t be giving people the impression that it doesn’t make a big difference if they don’t go out or if they do go out. Especially given that in reality it’s not about the individual. We’re all taught to think about this as individuals, but actually what decisions that the community as a whole make is going to impact the decision.

Original graph showcasing community decisions

Once the community has made that decision, you can take any individual and change their behavior one way or another and it won’t make a bit of difference. But so, like, kind of how to balance that lesson for the individual and make it relevant to the individual given that collective context was why it took so long to get out. There’s a lot of the technical stuff that took a decent amount of time, but really trying to figure out how to make that work was the hardest part of this I think, and led to the most delays.

Updated graph showcasing communal decisions

MH: What were your other struggles when building and creating this?

SO: I mean that one that we were just talking about, like how to balance the individual decisions with the collective was the biggest challenge. Another big one was how to balance making this relevant to the moment when everyone’s thinking about the current pandemic. We know that even forecasting teams that have huge resources and tons of analytics for this can’t accurately forecast how decisions will change the flow of the disease, it’s just too complicated. There isn’t enough data about how this particular disease propagates for a simple model to kind of do. So we want to be really careful not to present this as a forecast, but also make it feel relevant enough that it could actually, you know, feel like somebody should consider changing their behavior because of it.

So that was another one. Another part was while I have somewhat of a public health background, neither of us are epidemiologists. Luckily Shirley connected with somebody who is and was great to help give feedback. But, you know, it felt like if it would have been in a different moment when epidemiologists weren’t all super, super busy with this, it would have been great to have more of that expertise on the core team. Another kind of interesting struggle is whether to call this a game or a simulation or something. I guess the final one I’ll just add before that is that we were also always thinking about how to adjust for the next version and at first, our target audience was kind of people who were in a lot of places which weren’t on lockdown yet.

People are making a lot of individual decisions and there was a lot of education about flattening the curve and it was kind of geared towards individuals making decisions. And now, a lot of what I think is impacting what’s happening in different places are more policy and social decisions. They’re about whether workers have sick leave, they’re about whether companies are adequately protecting their workers and about you know, it’s not like a lot of the people going out on a daily basis are going out to get a drink with their friends. They’re going out because their work requires it or because they don’t have another form of income. I think that our initial target kind of shifted from what felt to me at least most relevant in this moment, which was some of those things. And I think that was kind of a challenge to figure out: If and how to incorporate those and what it would mean again for delaying the whole rollout of everything.

SW: Yeah. So just like Stephen said, I think the first half was really about, or well, I think in probably the first five or six weeks the challenge was really like, “Should we even be doing this?” Because there was the simulation piece from Washington Post, and we were like, “We don’t have an epidemiologist, we’re not even sure about any of these numbers and we’re going to have to make that up. And then, would anyone even find this useful?” So the first half was existential crisis-like questions and the second half, like Stephen said, was all about the story and the kind of the fine balance between the story, the UI, and the model. That includes the struggle with the community versus individualism. How do we get that across?

I think we tried to get that across with the copy. And I think that’s still difficult because not everybody reads everything. On top of what Stephen said about incorporating things more about the essential workers into the model, some other things that we’re trying to do is kind of refine the model more based on the weighted activities. That was the plan that hopefully once they have the activities that the model shifts so that you can kind of feel [a part of the story]. One of the biggest criticisms is like if someone chooses to go to a concert, the impact of that isn’t that much (this has been updated and changed). And of course it’s not because underneath the hood [the statistical model], it’s just a difference of going out seven times a week versus like six times a week [what’s weighted in the game].

So once we have a weight given to the implication of going out to a concert, then hopefully that will convince people more that it does have a big impact if everybody decides to go to a concert. Another challenge I keenly felt, I don’t know if Stephen you felt this, but I keenly felt the challenge of being an individual or being a very small group working on this. So, my background is as a freelancer. So I got to dedicate my full time on this, but I think everybody else already had a full time job and so they were working on this part time and putting in a lot of hours, thankfully. But because of that, we couldn’t turn it around in two days like the Washington Post can. Like it wasn’t easy for us to reach out to experts like newsrooms can. So I think that was another struggle I felt personally of like, “Would this process have gone easier if we were part of like a bigger newsroom with all of the resources that came with it?”

Ultimately, I’m really glad we’re not part of a newsroom because I think there’s something valuable to being able to get this information across as kind of independent people instead of a newsroom that already has a reputation for a certain bias. From a technical challenge, there were a lot of technical challenges in terms of the software architecture. Like I mentioned earlier, hopefully the UI feels easy for people to navigate through, but it was actually quite an interesting challenge making sure that everything was calculated, updating correctly and animating correctly.

Hospital beds in the game

The weirdest, biggest technical challenge I had was trying to fit the little beds that we have trying to fit that into the container. And what I mean by that is we have a container whose dimensions can change based on the window. We have a fixed bed icon image, but then the number of rows and columns can change based on the zip code. And so we have to figure out a scaling that will make them fit nicely into these. It was just a lot of variables and that took half an afternoon to figure out an efficient piece of code for it. So from a technical perspective, they were very fun challenges. But like Stephen said, they were solvable challenges versus a lot of these kinds of story UI simulation things.
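The bed-fitting problem Wu describes can be framed as a small search: for a given number of beds, pick the grid shape that lets a fixed-size icon be drawn as large as possible inside the container. The sketch below is a brute-force illustration with assumed names and pixel sizes, not the code used in the game.

```python
import math


def fit_beds(n_beds: int, container_w: float, container_h: float,
             icon_w: float, icon_h: float) -> tuple[float, int, int]:
    """Return (scale, cols, rows) maximizing the icon scale so all beds fit.

    Tries every possible column count; fine for a few hundred icons.
    """
    best = (0.0, 0, 0)
    for cols in range(1, n_beds + 1):
        rows = math.ceil(n_beds / cols)
        # Largest scale at which this grid fits both horizontally and vertically.
        scale = min(container_w / (cols * icon_w), container_h / (rows * icon_h))
        if scale > best[0]:
            best = (scale, cols, rows)
    return best


# Example: 120 beds (the figure Wu cites for her local hospital) in a
# hypothetical 600x200 container with a 20x10 icon.
scale, cols, rows = fit_beds(120, 600, 200, 20, 10)
print(scale, cols, rows)
```

Recomputing this whenever the window is resized or the ZIP Code changes keeps the grid inside its container.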

SO: Yeah, it was definitely, just to pile on, a huge struggle not being able to dedicate more time to this. I started it three days before my kids’ schools were closed due to COVID-19. And then some work responsibilities and kind of doing this at nights when I had the time was hard and there were a lot of things, especially in the interactive to the methodology that if we were in normal times, I would have wanted to do a lot more on before making it public. But I just kind of accepted that it’s going to be what it is. You know, I’m trying to be gentle on myself given the extraordinary circumstances, but that was definitely a big challenge.

SW: Yeah. And I’m really grateful for the time that you did put in for us and everybody on the Egghead.io team too, because they took it on top of their full time Egghead responsibilities, presumably.

MH: What involvement did the Data Visualization Society have in the making of this game?

SW: Well first of all, I think DVS was just helpful in letting us meet. I’m sure I would have spun in circles even longer trying to figure out how to make this happen. So this was really helpful. Amanda Makulec led the discussions on COVID-19 in the Slack channel. When I was starting to first work on the story, she was really helpful in giving feedback and refining it. This was I think around the same time that I met Stephen. Then I also got to talk to Joshua Smith [in the DVS Slack] and he had a perspective of what it’s like to go through this as someone that is immunocompromised. I started engaging with him because he was sharing his experiences on the “topics and database” channels about COVID-19. I got to talk to him further and it really helped me understand that perspective and that just because we wanted to make sure that this game is kind of lighter, it’s still serious and empathetic.

Once we felt like this was ready for feedback I also put it in the “Shared Critique” channel and a lot of people helped give some of the feedback and things that we want to implement that I mentioned earlier. Once we have that second round of features put in, I think I might share it again in the “Share your own work” channel. So yeah, in that sense it was really helpful.

MH: Do you know how many people have interacted and used the model and simulation so far?

SW: Yeah, so I was being dumb and missed the first 24 hours of analytics, which I think in retrospect, I’m so sad. But I think, and actually this is just off of one tweet I did last Monday, within the first 24 hours there was probably between 5k to 10k unique visits. In the past week, the analytics we had was that I think there’s been 15k or 13k unique visits since. And so I think it’s good numbers in terms of the fact that we just did one tweet for promotion. I think we’re trying to build in some of the feedback and then trying to go back and hopefully promote it some more so that more people will see it. Just because we’ve been told that it’s nice because it’s informative in an engaging way. The other key metric, if we are to believe the analytics website, the average time on-site is currently four minutes, which is I think most websites get between 10 to 30 seconds. So it’s quite incredible, which I think is encouraging because it means that hopefully people are engaging with the whole game. And that it’s working.

MH: What are your takeaways about the virus and what did you learn about the transmission of COVID-19 from working on this?

SO: I think the biggest thing for me was realizing how many subtle variables can make a really big difference in what sorts of interventions are most helpful. So the kind of biggest, you know, trust point is the question of, “Is it better to isolate people who are ill and their immediate contacts and do contact tracing and testing or is social distancing necessary?”

Those kinds of public health decisions are made from such a complex model of not just of the disease’s parameters, but also of society’s. You might’ve seen in articles talking about how Italy skews older, which is maybe why things were worse there than in some other countries. But it’s equally important if older people and younger people go to the same places in a society and there’s huge changes in different cultures and in different areas. Whether there’s a lot of mixing between different age groups and something like that can actually have a huge impact on the spread of a disease.

All of those subtleties, not just of how infectious people are before they’re symptomatic, or how many people are not symptomatic at all, and how long does it last after symptoms go away, like not just all of those things that people kind of know have an impact on how it spreads. All of these very complex elements of social structures is why the best forecasters in the world diverge hugely in what they’re predicting with different outcomes.

I was maybe aware of some of that before, but once I actually started to play with numbers and see how those subtle things really changed all of that was really interesting to me. I also got to learn more about the nature of some of these decisions that have to be made knowing that social distancing will help, but not actually really knowing how much it will help until we have empirical data to tell us. So yeah, I think that was the biggest learnings for me.

SW: So for me, coming into this project I knew nothing about public health. I wanted to be like, “Yeah, well at least I know some things about it.” And I was like, “No, I don’t think I knew anything about public health.” I totally didn’t know anything about epidemics. And so I feel like I learned everything [along the way], and I guess that includes the research I did and was doing. There was like two weeks where all I did was read COVID-19 news and I think I was becoming a horrible person to be around because that’s all I wanted to talk about. And unfortunately, my husband was the only person around, so he took the brunt of it.

So I feel like from this project, I just learned a lot about infectious diseases in general. I think the biggest thing that I learned from this game is, and I have to admit that I’m not sure if that’s due to the more cases that we have, but just how quickly it can spread if everybody’s acting like normal. And the second and the more important thing, actually no, two things: One, the first thing I learned when I was starting on this project was how few hospital beds there are. Like, before this whole thing, I didn’t even think about the number of hospital beds in this nation. And then I read there was 2.8 for every thousand, and I was like, “Oh, okay! Like, that seems like very little.” But I think those kind of relative numbers don’t really stick in my head. And then when I like started looking into all this data and I actually looked up the data from my own zip code where I think we have 67,000 people. We have 4,000-something people above the age of 80 and we have 120 beds in our local hospital. That’s not much! They fill up really quickly in our simulation. And so I think that was the first shocking thing that I learned, which is, we don’t have very many beds. Now we’ve learned that beds aren’t necessarily the important kind of number to go off of. It’s actually about rooms they can isolate in or ICU beds or ventilators, but I think it’s still a pretty decent stand-in.

End game explainer

And the second thing I really learned is that around when we were about to launch, I was talking to my friend and he’s a former journalist. He was like, “Well if you want this to be relevant,” ’cause you know, we were still building off of the original project back in mid-March when we were trying to be like “social distancing is important for flattening the curve” and that’s no longer relevant when we were trying to publish, right? So he was like, “You know what’ll make this a lot more relevant is if you just at the very end give people an option to see four more weeks if the restrictions were lifted.”

So I built this in that at the end of eight weeks, you can choose to see four more weeks if everybody went back to business as usual. When I built that in, I took a look at it and I kinda just ran it. I did the whole eight weeks for my own zip code and then I just lifted all of the restrictions and saw how quickly on that line chart [of infections and deaths] the line just went up and up and up. That was also really scary! Now, I really want more people to see this!

For more from Shirley Wu, you can follow her on Twitter and her website.

This conversation has been edited and condensed for clarity.

Madison Hall
Nightingale

Data journalist and visualization enthusiast @byMadisonhall on Twitter