Global Data Science Forum

How can I compute the probability "p" that determines how an infection spreads through a contact network during an epidemic?

By Joseph Umeana posted Thu November 26, 2020 11:46 PM


Here is what I need:
1. An algorithm (a conceptual approach to computing p), a modelling approach, and sample data elements for computing the probability. I am currently using the R programming environment.

In network analysis, the patterns by which epidemics spread through groups of people are determined not just by the properties of the pathogen (its contagiousness, the length of its infectious period, and its severity) but also by network structures within the affected population. The social network within a population, recording who knows whom, determines a lot about how the disease is likely to spread from one person to another. More generally, the opportunities for a disease to spread are given by a contact network: there is a node for each person, and an edge between two people if they come into contact in a way that makes it possible for the disease to spread from one to the other. This suggests that accurately modeling the underlying network is crucial to understanding the spread of an epidemic. Contact networks are also important in understanding how diseases spread through animal populations.

The pathogen and the network are closely intertwined: even within the same population, the contact networks for two different diseases can have very different structures, depending on the diseases’ respective modes of transmission. For a highly contagious disease with airborne transmission through coughs and sneezes, the contact network will include a huge number of links, including any pair of people who sat together on a bus or an airplane. For a disease requiring close contact, or a sexually transmitted disease, the contact network will be much sparser, with far fewer pairs of people connected by links.
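
As a possible starting point for the "sample data elements" asked about above, a contact network can be stored as a plain edge list in base R. The sketch below is only an illustration: the five people, the set of infectious people, and the value p = 0.2 are all made up, and it assumes each contact with an infectious neighbour transmits independently with the same probability p, so that a person with m infectious neighbours is infected with probability 1 - (1 - p)^m.

```r
# Hypothetical contact network: one row per undirected contact (edge).
contacts <- data.frame(
  from = c("A", "A", "B", "C", "D"),
  to   = c("B", "C", "C", "D", "E"),
  stringsAsFactors = FALSE
)
infected <- c("A", "B")   # assumed: people who are currently infectious
p <- 0.2                  # assumed: per-contact transmission probability

# Count the infectious neighbours of every person in the network.
people <- sort(unique(c(contacts$from, contacts$to)))
n_inf_nb <- sapply(people, function(v) {
  nb <- c(contacts$to[contacts$from == v], contacts$from[contacts$to == v])
  sum(nb %in% infected)
})

# P(at least one of m independent exposures transmits) = 1 - (1 - p)^m
risk <- 1 - (1 - p)^n_inf_nb
risk[people %in% infected] <- NA   # already infected, so no risk is reported
risk
```

Here person C has two infectious neighbours and so carries a higher risk than D or E, who have none. A sparser or denser edge list changes these risks even when p stays the same, which is the point made above about the pathogen and the network being intertwined.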

Technical Merit - Diseases and the Networks that Transmit Them

Connections to the Diffusion of Ideas and Behaviors. There are clear connections between epidemic disease and the diffusion of ideas through social networks. Both diseases and ideas can spread from person to person, across similar kinds of networks that connect people, and in this respect they exhibit very similar structural mechanisms, to the extent that the spread of ideas is often referred to as “social contagion”.

We begin with perhaps the simplest model of contagion, which we refer to as a branching process model. It works as follows.

  • (First wave.) Suppose that a person carrying a new disease enters a population, and transmits it to each person he meets independently with a probability of p. Further, suppose that he meets k people while he is contagious; let’s call these k people the first wave of the epidemic. Based on the random transmission of the disease from the initial person, some of the people in the first wave may get infected with the disease, while others may not.
  • (Second wave.) Now, each person in the first wave goes out into the population and meets k different people, resulting in a second wave of k · k = k² people. Each infected person in the first wave passes the disease to each of the k second-wave people they meet, again independently with probability p.
  • (Subsequent waves.) Further waves are formed in the same way, by having each person in the current wave meet k new people, passing the disease to each independently with probability p.
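
The three steps above can be sketched as a short simulation in R. This is a minimal illustration, not a fitted model: the function name `simulate_branching` and the values k = 3, p = 0.4, and 10 waves are assumptions chosen for the example. It relies on the fact that if n people in a wave are infected and each meets k new people, the number of new infections is Binomial(n · k, p). The expected number of new infections caused by one infected person is p · k (often called the basic reproductive number), so that product is a natural quantity to vary when exploring the model.

```r
# Minimal sketch of the branching process; k, p, and max_waves are assumed values.
simulate_branching <- function(k, p, max_waves = 10) {
  infected <- numeric(max_waves + 1)
  infected[1] <- 1                      # the initial carrier
  for (w in seq_len(max_waves)) {
    # infected[w] carriers each meet k new people; every meeting transmits
    # independently with probability p, so the total is Binomial(infected[w] * k, p)
    infected[w + 1] <- rbinom(1, size = infected[w] * k, prob = p)
    if (infected[w + 1] == 0) break     # the disease has died out
  }
  infected                              # infected counts: initial carrier, then waves 1..max_waves
}

set.seed(42)
k <- 3; p <- 0.4                        # illustrative only; here p * k = 1.2
runs <- replicate(2000, simulate_branching(k, p, max_waves = 10))

# Fraction of simulated epidemics in which wave 10 still contains infected people
still_active <- mean(runs[11, ] > 0)
still_active
```

Rerunning this with different p and k gives a feel for how the chance of the disease persisting depends on p · k. With real data, p would have to be estimated rather than assumed, for example from the observed share of contacts that led to infection.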



Tue December 15, 2020 11:05 PM

Probability distributions

R as a set of statistical tables

One convenient use of R is to provide a comprehensive set of statistical tables. Functions are provided to evaluate the cumulative distribution function P(X ≤ x), the probability density function and the quantile function (given q, the smallest x such that P(X ≤ x) ≥ q), and to simulate from the distribution.

Prefix the name given here by ‘d’ for the density, ‘p’ for the CDF, ‘q’ for the quantile function and ‘r’ for simulation (random deviates).

Distribution   R name    Additional arguments
beta           beta      shape1, shape2, ncp
binomial       binom     size, prob
Cauchy         cauchy    location, scale
chi-squared    chisq     df, ncp
exponential    exp       rate
F              f         df1, df2, ncp
normal         norm      mean, sd
Student’s t    t         df, ncp

Here are some examples:

> ## 2-tailed p-value for t distribution
> 2*pt(-2.43, df = 13)
> ## upper 1% point for an F(2, 7) distribution
> qf(0.01, 2, 7, lower.tail = FALSE)
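
The same prefix scheme applies to the binomial distribution, which is the one that matters for the branching process in the original question: if a contagious person meets k people and infects each independently with probability p, the number infected is Binomial(k, p). The sketch below uses assumed values k = 5 and p = 0.2 purely for illustration.

```r
k <- 5      # assumed number of contacts
p <- 0.2    # assumed per-contact transmission probability

# 'd': probability that none of the k contacts is infected, equal to (1 - p)^k
p_none <- dbinom(0, size = k, prob = p)

# 'p': CDF; 1 - P(X <= 0) is the probability that at least one contact is infected
p_at_least_one <- 1 - pbinom(0, size = k, prob = p)

# 'q': smallest x with P(X <= x) >= 0.5, i.e. the median number of infections
median_infected <- qbinom(0.5, size = k, prob = p)

# 'r': simulate the number of infections for three contagious people
set.seed(1)
simulated <- rbinom(3, size = k, prob = p)

c(p_none = p_none, p_at_least_one = p_at_least_one, median = median_infected)
```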

Mon December 14, 2020 09:19 PM

How to calculate p-value from test statistic?

To determine the p-value, you need to know the distribution of your test statistic under the assumption that the null hypothesis is true. Then, with the help of the cumulative distribution function (cdf) of this distribution, you can express the probability that the test statistic takes a value at least as extreme as the value x observed in the sample.

How do you calculate the p-value in a hypothesis test?

The p-value is calculated using the sampling distribution of the test statistic under the null hypothesis, the sample data, and the type of test being done (lower-tailed test, upper-tailed test, or two-sided test). The p-value for a lower-tailed test is given by: p-value = P(TS ≤ ts | H0 is true) = cdf(ts)
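
In R, the cdf for a t-distributed test statistic is `pt`, so the lower-tailed formula can be evaluated directly. The sketch below reuses the statistic -2.43 with 13 degrees of freedom from the earlier t example. The upper-tailed and two-sided lines are not spelled out in the text above: they use the standard definitions for a continuous, symmetric null distribution and should be read as an assumption about what the truncated passage intended.

```r
ts <- -2.43   # observed test statistic (from the earlier t example)
df <- 13      # degrees of freedom (from the earlier t example)

# Lower-tailed: P(TS <= ts | H0) = cdf(ts)
p_lower <- pt(ts, df = df)

# Upper-tailed: P(TS >= ts | H0) = 1 - cdf(ts)
p_upper <- pt(ts, df = df, lower.tail = FALSE)

# Two-sided, for a symmetric distribution: twice the smaller tail
p_two_sided <- 2 * min(p_lower, p_upper)

c(lower = p_lower, upper = p_upper, two_sided = p_two_sided)
```

For this negative statistic the two-sided value equals `2*pt(-2.43, df = 13)`, matching the earlier example.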