Tuesday, July 7, 2020

Figures Don't Lie, But Liars Can Figure


Updated on 7/8/2020 - see below

And sadly I must report that the San Diego County Health Officials (and the CA State folks, up to and including Goober Newsom) have been 'figuring' (or perhaps just showing their ignorance).

First a bit of background.  When Goober Newsom allowed the various counties around CA to 'open up' (albeit slowly), he put some 'metrics' in place to monitor the counties' behavior/results.  Now at first blush this seems a reasonable and responsible action.  However, the metrics (set in mid-April) have not been modified to account for the changing reality, and as a result are now being applied, I believe improperly, to shut businesses down in San Diego County (and I must wonder if in other counties as well).  I have been saving lots of the historical CoViD data for San Diego County and I dug into it to determine if they are comparing apples to oranges.  And the result is (drum roll)...

They are comparing apples to squirrels.  The data today simply cannot be compared in a straightforward way to metrics based on data from back then, for a myriad of reasons.  I will now attempt to convince you, dear reader, that I know what I'm talking about.

The metric which has caused us to go on the Goober's 'watchlist' is that we have exceeded the allowable average daily count of new cases (a 14 day running average which is also divided by the county's population in units of 100,000; for San Diego that means it is divided by about 33).  Now if that is unclear, please excuse me, but that's the metric.  When the metric exceeds 100 for 3 consecutive days, the state starts to 'watch', and if it stays above 100 for 3 more days, then the state orders all 'indoor' business to close for at least 3 weeks.  But why?  If we are aiming to someday reach herd immunity (and so long as we lack a vaccine, that's the only path available to a general return to normalcy) then having the cases rise speeds the process.  Of course I would prefer we do this as safely as is reasonably possible, so it is reasonable and prudent to try to keep cases, hospitalizations, and ICU admissions down, but really we should be striving to keep deaths down.  You will note that the metric simply does not take this into account.  Additionally, the metric does not take into account the tremendous rise in testing since those early days.  Perhaps they were simply trying to get a surrogate for deaths: not a count made after people had died, but a metric that would predict deaths.  At first blush cases seem a reasonable choice.

As of the day they set the metric, the death rate was running at about 6% in San Diego County*.  At the time the highest number of deaths in a week was 40.  If we divide that by 33 to get it per 100,000, we get 1.2 deaths per 100,000 per week, at the maximum.  But the cases were running at about 570 per week, and that's about 17 per week per 100,000.  So let's work with the assumption that they were prepared to accept the deaths that would be produced if the case rate rose to the 'magic' 100 per week per 100,000.  That works out to about 1.2*100/17 or about 7 per week per 100,000.  So far, while the metric is convoluted and only marginally predictive, it is not insensible.
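For those who want to check the arithmetic, here is a minimal sketch in Python.  The numbers are the ones quoted above; the variable names are mine, and the linear scaling of deaths with cases is the same assumption the paragraph makes, not anything the state has published.

```python
# Figures quoted in the post (mid-April, San Diego County).
POP_100K = 33            # county population in units of 100,000 (approximate)
peak_weekly_deaths = 40  # highest number of deaths in a week at the time
weekly_cases = 570       # approximate cases per week at the time
threshold_cases = 100    # the 'magic' case threshold per 100,000, as read in the post

deaths_per_100k = peak_weekly_deaths / POP_100K
cases_per_100k = weekly_cases / POP_100K

# Assumption: deaths scale linearly with cases, so the deaths implied by the
# threshold are the observed deaths times (threshold / observed cases).
# The result is in the same weekly units as the inputs.
implied_deaths_per_100k = deaths_per_100k * threshold_cases / cases_per_100k

print(round(deaths_per_100k, 1))          # 1.2
print(round(cases_per_100k, 1))           # 17.3
print(round(implied_deaths_per_100k, 1))  # 7.0
```

Note that the population factor cancels out: the implied figure is just 40 * 100 / 570, so it does not depend on exactly what value is used for the "about 33".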

But how does that apply to today?  Somewhere between poorly and utterly and completely irrelevant.

Why?  Because the demographics of those getting the disease have changed markedly, testing capacity has grown substantially, and, I suspect, the treatments are better.  Let's agree to ignore the last possibility and focus on the first two.  As to the number of tests, back in mid-April we were running around 1,100 tests per day in San Diego County.  Today that average is more like 6,700.  If the actual fraction of people infected stays the same, and the number of cases is substantially larger than the number confirmed, then the total testing positive would be a fairly constant fraction of the tests, so we'd expect to see something like 6 times the number of cases (and it is a factor of 4.5, not too far off).  Additionally, as contact tracing has improved we'd expect to see even more of the asymptomatic cases be identified, so just using case numbers is nonsense!  And we haven't even started on the issue of the demographic change.

The demographics of the cases for San Diego County in mid-April consistently showed that the three most susceptible age groups, 60-70, 70-80, and 80+, accounted for 25% of the cases and nearly 90% of the deaths (with the under 40 demographics accounting for about 35% of the cases, but only 2% of the deaths).  So if we really want to keep the deaths down, we need to worry about the 60 and over crowd (and be somewhat concerned about those 40-60).

Well we have the data, so what's the death rate today, and what do we expect to see for the next 2 weeks (based on today's cases)?  Over the last three days we have seen a total of 0 deaths.  OK, even I don't believe that.  It's probably something of an artifact, likely caused by the holiday weekend.  Let's look instead at the deaths for the 7 days prior.  Total deaths: 27.  After we divide by 33 (to get it per 100,000) the total is less than 1 (about 0.8).  Not even as high as the totals we'd seen before, let alone jumping up towards 7.  How about the projection for two weeks out (based on the fact that there were 3392 cases reported in the week 6/28-7/4)?  Well, we need to break these down into their demographic groups.  Fortunately, I have that data (it is in Table 1 below for those who want to check my math).  There it is: the projected deaths per week per 100,000 in two weeks is likely to be less than 2.  Again, nowhere near the 7 they seemed to be willing to tolerate.  Given that, the closure of businesses is completely unwarranted.

             
UPDATE: Note added in proof.

It strikes me that if my method of predicting deaths from cases has any validity I should be able to take the case data from two weeks ago and predict the deaths as of today.  I can't believe I didn't think to do this yesterday.  Anyway, the data are shown in Table 2 below.  The prediction is a total of 32.74 deaths for the week ending July 4.  The observation was 27.  That seems close enough to me.  It also suggests that the prediction for 2 weeks from now is likely to be high by about 20% (this could be due to lots of things, but I'd bet on most of it being due to more asymptomatic cases being found through contact tracing, which would simply have gone undetected previously, increasing the case count but not contributing to the death total).  I won't be surprised to see deaths for two weeks from now come in at around 50 (or about 1.5 per 100,000).  I also added Table 3, which is the prediction for the deaths for the week ending 7/11/2020.  (Again, I can't see why I didn't think to do this earlier.)  So we'll see how this works.  Oh, and that's for data ending before San Diego got on our Goober's Watchlist, and it is slightly higher than the prediction for deaths for the week that caused us to go on the list, but still essentially 2 / 100,000.
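The back-test correction can be written out as a short Python sketch.  The totals are the ones from Tables 1 and 2; the variable names are mine, and scaling the later prediction by the back-test ratio is a crude single-point calibration, offered as an illustration and not as a proper model.

```python
POP_100K = 33            # county population in units of 100,000 (approximate)

predicted_jul4 = 32.74   # Table 2: predicted deaths, week ending 7/4/2020
observed_jul4 = 27       # observed deaths for that same week
predicted_jul18 = 60.97  # Table 1: predicted deaths, week ending 7/18/2020

# How far the method overshot on the one week where it can be checked.
overshoot = predicted_jul4 / observed_jul4 - 1

# Assumption: the same overshoot applies two weeks later.
adjusted_jul18 = predicted_jul18 / (1 + overshoot)

print(f"{overshoot:.0%}")                     # 21%
print(round(adjusted_jul18))                  # 50
print(round(adjusted_jul18 / POP_100K, 1))    # 1.5
```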

             

* - this number is what I get by taking an average number of new cases a little before the date the metric was set and an average number of deaths over the time period 14 days after that.  It's crude, but it isn't unreasonable.

             

TABLE 1: The death rate for each age group is calculated by taking the number of deaths (as of 7/4/2020) and dividing by the number of cases two weeks earlier (as of 6/20/2020).  The new case totals are for the week ending 7/4/2020, so the death prediction is for the week ending 7/18/2020.  The total number of deaths would be expected at about 61 for that week (compared to the mid-April weekly peak of about 40 above).  That's still less than 2 / 100,000.

Age group  New cases  Death rate  Expected deaths
0-10             110       0/254                0
10-20            267       0/558                0
20-30           1109      3/2152             1.55
30-40            674      4/2000             1.35
40-50            446     12/1633             3.28
50-60            381     29/1810             6.10
60-70            235     60/1129            12.49
70-80             93      93/645            13.41
80+               74     186/604            22.79
Total expected deaths                  60.97 ≅ 61
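For anyone who wants to redo the Table 1 math, here is a minimal Python sketch of the method: for each age group, multiply the week's new cases by that group's death rate (deaths divided by cases two weeks earlier) and add up the results.  The data are copied from Table 1; the names `TABLE_1` and `expected_deaths` are mine, purely for illustration.

```python
POP_100K = 33  # county population in units of 100,000 (approximate)

# Age group -> (new cases, week ending 7/4/2020;
#               deaths as of 7/4/2020;
#               cases as of 6/20/2020)
TABLE_1 = {
    "0-10":  (110,    0,  254),
    "10-20": (267,    0,  558),
    "20-30": (1109,   3, 2152),
    "30-40": (674,    4, 2000),
    "40-50": (446,   12, 1633),
    "50-60": (381,   29, 1810),
    "60-70": (235,   60, 1129),
    "70-80": (93,    93,  645),
    "80+":   (74,   186,  604),
}


def expected_deaths(table):
    """Expected deaths per age group: new cases times (deaths / earlier cases)."""
    return {
        age: new_cases * deaths / cases
        for age, (new_cases, deaths, cases) in table.items()
    }


per_group = expected_deaths(TABLE_1)
total = sum(per_group.values())

for age, value in per_group.items():
    print(f"{age:<6} {value:6.2f}")

# Summing the unrounded values can differ from the table total in the last
# digit, since the table adds up figures already rounded to two decimals.
print(f"Total expected deaths: about {round(total)}")
print(f"Per 100,000: {total / POP_100K:.2f}")
```

The same function can be pointed at the case counts for any other week, so long as the death rates are held fixed at the values shown here.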

             

TABLE 2: The death rate for each age group is as in Table 1.  The new cases by age group are for the 7 days ending 6/20/2020.  The expected deaths are for the week ending 7/4/2020.  The total number of deaths would have been expected at about 33 for the week (compared to 27 observed).

Age group  New cases  Death rate  Expected deaths
0-10              48       0/254                0
10-20            122       0/558                0
20-30            360      3/2152             0.50
30-40            221      4/2000             0.44
40-50            173     12/1633             1.27
50-60            207     29/1810             3.32
60-70            125     60/1129             6.64
70-80             53      93/645             7.64
80+               42     186/604            12.93
Total expected deaths                  32.74 ≅ 33
Total observed deaths                          27

             

TABLE 3: The death rate for each age group is as in Table 1.  The new cases by age group are for the 7 days ending 6/27/2020.  The expected deaths are for the week ending 7/11/2020.

Age group  New cases  Death rate  Expected deaths
0-10              83       0/254                0
10-20            210       0/558                0
20-30            734      3/2152             1.02
30-40            483      4/2000             0.97
40-50            341     12/1633             2.51
50-60            269     29/1810             4.31
60-70            201     60/1129            10.68
70-80            116      93/645            16.73
80+               98     186/604            30.18
Total expected deaths                  66.40 ≅ 66

3 comments:

K T Cat said...

As usual, you supply the rigor in abundance. I had the feeling this was the case as the deaths didn't seem to be increasing. Since the demographics of the protests, such as they were here, and nightlife skew heavily towards the young, it's no surprise that opening up spread the Wuhan Flu among the kids. As you say, this is exactly what we want. We get lots of inert nodes in our network in exchange for almost no deaths or hospitalizations. It would seem that this is the perfect solution.

Then again, what do I know? I think women can't become men. I am unlearned in the ways of SCIENCE.

Ohioan@Heart said...

I had to post... I was losing my voice screaming at the TV every time some politician or talking head came on and spewed the ‘party line’ on the cases metric. Not that it will do anything to change anything, but at least I vented, rather than just blowing my top. Besides, Mrs Ohioan was starting to get grumpy about having to hear the one-sided debates.

Ohioan@Heart said...

Wow. It makes a lot of sense. It also suggests that the burn out is most likely due to "undetected" infections making the actual rate so high that herd immunity has been achieved. That would require an undetected to detected ratio of something like (5 or 6) to 1 (total infections then being 75-90 %).

With deaths coming in at 500-600 / 1,000,000, and the number of observed cases at 150,000 / 1,000,000, the apparent death rate would work out to 0.33-0.4%.  But since total cases would actually be 750,000-900,000 per 1,000,000, the actual death rate is from 500/900,000 to 600/750,000, or 0.055-0.08%.
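A quick check of that arithmetic, as a minimal Python sketch.  All of the figures are taken from the comment as given; the variable names are mine, and nothing here verifies the underlying numbers.

```python
deaths_per_million = (500, 600)              # low and high estimates
detected_per_million = 150_000               # observed (detected) cases
infected_per_million = (750_000, 900_000)    # detected plus undetected, 75-90%

# Apparent death rate: deaths divided by detected cases only.
apparent_low = deaths_per_million[0] / detected_per_million
apparent_high = deaths_per_million[1] / detected_per_million

# Actual death rate: fewest deaths over most infections, and the reverse.
actual_low = deaths_per_million[0] / infected_per_million[1]
actual_high = deaths_per_million[1] / infected_per_million[0]

print(f"apparent: {apparent_low:.2%} to {apparent_high:.2%}")
print(f"actual:   {actual_low:.3%} to {actual_high:.3%}")
```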