Sunday, March 22, 2020

Nerding-out on Covid19

It is interesting that we can get a daily account of the number of Covid19 cases. For the most part, the reports give the total number of cases and the number of new cases.

Since I live in Minnesota, this is the data I track the most. As reported in the news, testing is limited and the data likely do not reflect the true number of coronavirus infections; testing seems to focus on capturing the worst cases. Still, there are some interesting things to tease out, even if you're playing with just the news-reported data.

My disclaimer here is that I'm doing minimal analysis, and this early in the "pandemic" I'm trying not to let the "Belief in the Law of Small Numbers" [Tversky and Kahneman 1971] cause me to jump to any conclusions. Also, I apologize that my graphs aren't as fancy as, say, the Star Tribune's or the Johns Hopkins app. Frankly, I'm feeling kind of lazy about fancy visualization tonight.

Looking at the total number of cases

Fig. 1, the overall MN cases as typically presented.

This is the first graph that catches my eye, and I'm assuming most people's: the total number of cases plotted as a function of date (Fig. 1). As you can see, the number of cases starts really low and stays there for roughly 10 days, after which the total begins to increase in that "Hockey Stick" manner that has been described.

This trend is typical of an outbreak: each sick person is capable of spreading the infection to multiple others, so the process accelerates. Naturally, as you run out of people to infect, the curve flattens out again. Such a trend is referred to as a sigmoid (S-shaped) curve.
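The S-shaped trend described above is often modeled with a logistic function. Here is a minimal sketch in Python; the final total, growth rate, and midpoint are made-up numbers chosen only to show the shape, not values fitted to the Minnesota data.

```python
import math

def logistic_cases(day, final_total, growth_rate, midpoint_day):
    """Cumulative cases on a given day under a simple logistic (S-shaped) curve."""
    return final_total / (1.0 + math.exp(-growth_rate * (day - midpoint_day)))

# Hypothetical parameters (assumptions for illustration, not real estimates).
FINAL_TOTAL = 10_000   # level at which the curve flattens out
GROWTH_RATE = 0.25     # steepness of the middle section, per day
MIDPOINT_DAY = 40      # day on which half of the final total is reached

# Print the curve every 10 days: slow start, fast middle, flat end.
for day in range(0, 81, 10):
    print(day, round(logistic_cases(day, FINAL_TOTAL, GROWTH_RATE, MIDPOINT_DAY)))
```

A smaller growth rate stretches the same "S" over more days, which is the whole point of slowing the spread.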

When covid19 is behind us this graph will have a lovely "S" shape. The challenge in the current situation is our ability to control how steep (or drawn out) this curve is. As there are no treatments, vaccines, etc. available, the number of cases will keep growing at some rate.

As we see in Italy, if this curve grows too rapidly we run out of the ability to care for and treat those who are more critically infected, and the death toll rises. There are only so many beds, ventilators and medical professionals available. Thus, while we are statistically stuck with the sigmoid curve, it is important to try our best to stretch out its growth so that we can make the most of our resources. It may even buy time for one of these "hope" treatments to prove that it works, which would let us control the tail, i.e. how many total cases there are.

So my question is: is the stay home/stay away policy working? It's very hard to tell from this graph. It tilts up, people throw down the paper and stomp away hopelessly, and there aren't a whole ton of data points yet. The trouble is that in a few more days we could be in over our heads and living like the folks from "The Walking Dead".

Fortunately, the same data can often be looked at in different ways and it MAY tell you a little something more. Since the number of cases is growing by some exponential-ish function, the standard linear coordinates of the graph may be hiding changes in the sigmoidal behavior.

Fig. 2, overall MN Covid19 cases plotted on a Log-Linear scale

If we plot the number of cases with a vertical scale that is logarithmic (Fig. 2), so that each major line represents a 10x increase in cases, we see something drastically different. (Note: I am now plotting against the number of days since the first reported case.) Here, up through day 11 the number of cases increases with a certain steepness (rate). However, after day 11 the data become less steep, indicating a change in the character of the sigmoid function. The rate began to slow down!
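On a log scale, exponential growth is a straight line, so a change in steepness is a change in doubling time. A rough sketch of how the two slopes could be measured follows. The case counts are synthetic (generated to double every 2 days up to day 11 and every 4 days afterwards), not the real Minnesota numbers, and the function names are my own.

```python
import math

def log_linear_fit(days, cases):
    """Least-squares fit of log10(cases) = intercept + slope * day."""
    logs = [math.log10(c) for c in cases]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def doubling_time(slope):
    """Days needed for cases to double, given the slope in log10 units per day."""
    return math.log10(2) / slope

# Synthetic data (an assumption for illustration only).
early_days = list(range(1, 12))                    # days 1 through 11
early_cases = [2 ** (d / 2) for d in early_days]   # doubles every 2 days
late_days = list(range(11, 21))                    # days 11 through 20
late_cases = [2 ** (11 / 2) * 2 ** ((d - 11) / 4) for d in late_days]  # every 4 days

early_slope, _ = log_linear_fit(early_days, early_cases)
late_slope, _ = log_linear_fit(late_days, late_cases)
print("early doubling time (days):", doubling_time(early_slope))
print("late doubling time (days):", doubling_time(late_slope))
```

With real counts the fit would be noisier, and with this few points the small-numbers caveat applies in full.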

Now, I have no clue if this is enough of a slowdown to keep our resources in check, but it's a noticeable improvement in the data. The original trendline would result in 1,000 cases before the 20th day. If things stay on course, the 1,000th MN case will be sometime around April 4th. Even that extra ten days will likely save a lot of lives. 
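The "when do we hit 1,000 cases" estimate is just a straight-line extrapolation on the log plot. A small sketch of that arithmetic is below. The anchor count at day 11 and both doubling times are hypothetical stand-ins, since the exact values behind the trend lines aren't listed here.

```python
import math

def trend_line(anchor_day, anchor_cases, doubling_days):
    """Slope and intercept of log10(cases) vs. day for a line through one point."""
    slope = math.log10(2) / doubling_days
    intercept = math.log10(anchor_cases) - slope * anchor_day
    return slope, intercept

def day_reaching(target_cases, slope, intercept):
    """Day on which the straight log-linear trend reaches target_cases."""
    return (math.log10(target_cases) - intercept) / slope

# Hypothetical inputs (assumptions, not the actual Minnesota values).
ANCHOR_DAY = 11      # day where the slope appears to change
ANCHOR_CASES = 50    # made-up case count on that day

fast = trend_line(ANCHOR_DAY, ANCHOR_CASES, doubling_days=2.0)
slow = trend_line(ANCHOR_DAY, ANCHOR_CASES, doubling_days=4.0)

print("fast trend reaches 1,000 on day", day_reaching(1000, *fast))
print("slow trend reaches 1,000 on day", day_reaching(1000, *slow))
```

The gap between the two printed days is the time bought by the slower trend.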

So what was the magic here to bend the curve? Day 11 is Sunday, March 15th. After this day the vast majority of schools were closed. Restaurant closures began the next evening.

Probably not worth it from a health standpoint, but it would be interesting to have enough data to separate the impact of the school closures from that of the business closures. It does look like one or both (along with a lot of other variables, such as reduced air travel) worked to an extent.

Population Density

The social distancing we are being asked to practice is an attempt to control one main thing: population density. Indeed, a physical model of covid19 propagation as a function of inter-personal distance is likely being developed and will guide what additional steps need to be taken. Something along the lines of:

 "Assume an array of spherical people (physics joke) in a 2D hexagonal array with spacings between people of x meters (okay feet for you 'mericans). Also assume the probability of Covid19 infection decreases by 50% per day for every meter of separation. Starting with one covid19 infected person (and probably a periodic boundary condition) find the optimal spacing to ensure enough medical resources for six months. Now add two twits who walk around the array randomly looking for toilet paper; what is the new required separation accounting for Mr. and Mrs. Whipple?"
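For fun, here is a toy version of that problem in Python. It is a sketch under loud assumptions: people sit on a hexagonal lattice with periodic boundaries, nobody recovers, and I read "decreases by 50% for every meter" as a daily neighbor-to-neighbor transmission probability of 0.5 raised to the spacing in meters. Medical resources and the toilet-paper wanderers are not modeled, and all names are made up for this example.

```python
import random

# The six neighbors of a site on a hexagonal lattice, in axial coordinates.
NEIGHBOR_STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]

def simulate(size, spacing_m, days, seed=0):
    """Return the number of infected people at the end of each day.

    size      -- lattice is size x size people, wrapped periodically
    spacing_m -- distance between neighbors in meters
    days      -- number of days to simulate
    """
    rng = random.Random(seed)
    # Assumed rule: the chance of infecting a neighbor halves per meter.
    p_transmit = 0.5 ** spacing_m
    infected = {(0, 0)}                      # start with one infected person
    history = [len(infected)]
    for _ in range(days):
        newly = set()
        for q, r in sorted(infected):        # sorted for reproducible runs
            for dq, dr in NEIGHBOR_STEPS:
                neighbor = ((q + dq) % size, (r + dr) % size)
                if neighbor not in infected and rng.random() < p_transmit:
                    newly.add(neighbor)
        infected |= newly
        history.append(len(infected))
    return history

# Compare a few spacings on a 20 x 20 array over 30 days.
for spacing in (0.5, 1.0, 2.0, 4.0):
    print(spacing, "m:", simulate(size=20, spacing_m=spacing, days=30)[-1], "infected")
```

Even this crude version shows the idea: spreading people out lowers the per-contact odds, and the outbreak takes longer to cover the array.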

The Star Tribune MN map shows the number of cases in each county. From there it's pretty simple to look up each county's population density (in people per square mile, which I'll call ppsm). Plotting the current overall number of cases versus population density (Fig. 3) shows another, hopefully obvious, trend: fewer people close together = fewer cases. In other words, the rate of transmission decreases as density drops. Barring weary travelers carrying the plague, a low enough population density would even extinguish the virus.

Fig. 3, overall number of MN cases versus county population density.

The two highest-density counties, Hennepin and Ramsey, have the highest number of cases. Drawing very rough, guesstimated trend lines on the graph (much more data is needed to call this an actual trend) shows a crossing point at 1,000 ppsm, which could be a density-related tipping point for transmission.
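Finding where two hand-drawn trend lines cross is simple algebra. A sketch with invented points follows; the coordinates below are placeholders, not values read off Fig. 3, so the crossing it prints is not the 1,000 ppsm estimate.

```python
def line_through(p1, p2):
    """Slope and intercept of the straight line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept

def crossing_point(line_a, line_b):
    """(x, y) where two lines meet, or None if they are parallel."""
    (slope_a, intercept_a), (slope_b, intercept_b) = line_a, line_b
    if slope_a == slope_b:
        return None
    x = (intercept_b - intercept_a) / (slope_a - slope_b)
    return x, slope_a * x + intercept_a

# Hypothetical (density in ppsm, cases) points picked for illustration.
low_density_line = line_through((10, 1), (510, 6))        # shallow trend
high_density_line = line_through((1500, 30), (2500, 80))  # steep trend

print(crossing_point(low_density_line, high_density_line))
```

With only a handful of counties, the crossing point moves a lot depending on how the lines are drawn, so treat any such number as a guess.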

Again, there are more variables behind this type of trend line, and this is highly unscientific at this point, but the idea is there: stay home and away from others and the transmission slows! This is tougher in the more urban areas, as there are people everywhere, but it is extremely important.

That's enough fun with graphs for one day. Stay home, stay healthy and try to enjoy the little breather (pardon the pun) that life's given us at the moment.
