Global Data Science Forum

Fitting Normal Curve to COVID Data

By Moloy De posted Thu October 29, 2020 09:39 PM

Daily COVID data for India is being reported on the net here. Various CSV files, including raw data, daily data, and state-wise and district-wise data, are available on the site here. I have been looking at the Case Time Series data for quite some time, possibly since April 2020. The data looks as follows:

As a statistician, it has been my delight to dig interesting insights out of the data. I have been trying to predict the rise and fall in the Daily Confirmed column. It was impossible to predict until the peak emerged around the end of September in India. Epidemiology offers several models for studying the spread of a disease, including the SIR (Susceptible, Infected, Recovered) Model and its various extensions. After trying several such ideas, I became interested in free-hand curve fitting using Excel. Given the bell-shaped movement of the time series, choosing the Normal Curve was obvious.
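For readers unfamiliar with the SIR Model mentioned above, here is a minimal discrete-time sketch of how it produces a rise and fall in daily new cases. The function name `simulate_sir` and every parameter value are hypothetical and chosen only for illustration; nothing here is estimated from the India data.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Discrete-time (one step per day) SIR simulation.

    beta  : assumed transmission rate per day (hypothetical value below)
    gamma : assumed recovery rate per day (hypothetical value below)
    Returns the daily new infections and the final (S, I, R) state.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    daily_new = []
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        daily_new.append(new_infections)
    return daily_new, (s, i, r)


# Hypothetical parameters, chosen only for illustration (not estimated from data).
daily_new, (s, i, r) = simulate_sir(
    population=1_000_000, initial_infected=10, beta=0.2, gamma=0.1, days=365
)
peak_day = max(range(len(daily_new)), key=daily_new.__getitem__)
print(f"Simulated daily new cases peak on day {peak_day + 1}")
```

In this sketch the daily new cases rise, peak, and fall on their own as the susceptible pool shrinks, which is the same qualitative shape the curve fitting below tries to capture directly.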

The Normal Curve is a bell-shaped exponential curve with three parameters: A is the peak value (the maximum of the exponential part is 1), B is the location parameter where the peak occurs, and C is the scale factor. My data spans 30th January to 26th October. The peak of A = 1,00,000 occurred at B = 228, on 16th September, and the scale factor C = 50 was a somewhat ad hoc choice made by looking at the fit of the curve. The R-squared value of 96.91% was impressive. According to this fit, COVID daily new cases will cease to occur by the end of 2020. Many countries are reporting a second peak, but I am skeptical about it.
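The hand-tuned fit described above can be sketched roughly as follows. The functional form y = A * exp(-(x - B)^2 / (2 * C^2)) is an assumption that matches the description of A, B, and C; the original formula may place C differently (for example, without the factor 2). The `observed` series is a synthetic stand-in, not the real Daily Confirmed column, and `normal_curve` and `r_squared` are illustrative names.

```python
import math


def normal_curve(x, a, b, c):
    """Bell curve with peak a at x = b and scale c (assumed functional form)."""
    return a * math.exp(-((x - b) ** 2) / (2 * c ** 2))


def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


A, B = 100_000, 228          # peak value and peak day quoted in the post
days = list(range(1, 272))   # 271 days, 30 January to 26 October 2020

# Synthetic stand-in for the Daily Confirmed column: a bell curve with C = 50
# plus a deterministic 10% wiggle. This is NOT the real data.
observed = [normal_curve(d, A, B, 50) * (1 + 0.1 * math.sin(d)) for d in days]

# Try a few scale factors by hand, as one might in a spreadsheet,
# and keep the one with the highest R-squared.
scores = {}
for c in (30, 40, 50, 60, 70):
    fitted = [normal_curve(d, A, B, c) for d in days]
    scores[c] = r_squared(observed, fitted)
best_c = max(scores, key=scores.get)
print(best_c, round(scores[best_c], 4))
```

With the real series in place of `observed`, the same loop would show how sensitive the R-squared value is to the choice of C.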

As a cautionary note, one should not confuse fitting the above Normal Curve with fitting a Normal Distribution. The histogram of Daily Confirmed looks far from Normal.
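The distinction can be illustrated with a small sketch. It takes idealised daily counts lying exactly on the curve (using the same assumed functional form, with C under a factor 2, and the parameters quoted above) and builds a histogram of the values themselves. The data is synthetic, and indexing the days from 1 to 271 is an assumption.

```python
import math

A, B, C = 100_000, 228, 50   # parameters quoted in the post
days = range(1, 272)         # 271 days, assuming day 1 = 30 January 2020

# Idealised daily counts lying exactly on the curve (assumed functional form).
values = [A * math.exp(-((d - B) ** 2) / (2 * C ** 2)) for d in days]

# Histogram of the VALUES: ten equal-width bins from 0 to A.
n_bins = 10
counts = [0] * n_bins
for v in values:
    idx = min(int(v / A * n_bins), n_bins - 1)  # put v == A in the last bin
    counts[idx] += 1

for k, n in enumerate(counts):
    lo, hi = k * A // n_bins, (k + 1) * A // n_bins
    print(f"{lo:>7,} to {hi:>7,}: {n} days")
```

Even for values that sit perfectly on a bell-shaped curve over time, the counts pile up in the lowest and highest bins rather than in the middle, so the histogram of the values is not bell shaped.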

QUESTION I: How could the SIR Model be used for prediction / forecasting?
QUESTION II: What other curves could be fit to the COVID Daily Confirmed data?




Mon December 21, 2020 01:20 AM

Hi Joseph, I read your post. It's a lovely application of the Branching Process. Given estimates of only two parameters, k and p, one can build the model and use mathematical analysis or simulation to dig out insights. I think epidemiologists are already doing this, and I would suggest you have a look at their work.

Tue December 15, 2020 11:12 PM

Thanks Joseph. May I have the URL of your post, please?

Tue December 15, 2020 10:58 PM

Hello Moloy De

I noticed the SIR Epidemic Model cited in your post. Could you please look at my post and give me your insight on the conceptual approach? You can also reach me at this
Thanks in advance.

Wed November 04, 2020 12:27 AM

This is interesting. I hope the daily cases continue to follow this normal curve going forward!