Making Sense of Biostatistics

A population proportion is the percentage of members of a population who share a given characteristic. For example, the proportion of females in the U.S. population is about 51%. In contrast, the proportion in China is about 49%. To estimate the difference between two population proportions with a confidence interval, you can use the Central Limit Theorem when each sample contains at least 30 members.

The Central Limit Theorem states that the arithmetic mean of a sufficiently large number of independent random variables, each with a well-defined expected value and well-defined variance, will be approximately normally distributed, regardless of the underlying distribution [1]. In other words, while the arithmetic means of the samples will vary, those means will follow an approximately normal distribution, i.e., a bell curve. The normal distribution is important because it enables the use of various statistical methods, such as the one described below.

The Central Limit Theorem describes the characteristics of a “population of the means” that could, in theory, be created from the means of an infinite number of random samples of size n, all of them drawn from a given “parent population.” The Central Limit Theorem predicts that, regardless of the distribution of the parent population:

1. The mean of the population of means is always equal to the mean of the parent population from which the samples were drawn.
2. The standard deviation of the population of means is always equal to the standard deviation of the parent population divided by the square root of the sample size n.
3. The distribution of means will increasingly approximate a normal distribution as the sample size n increases.
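The first two predictions can be checked with a small simulation. The following is a minimal sketch, not part of the original article: the exponential parent population, the sample size of 50, the number of trials, and the helper name `sample_means` are all illustrative assumptions.

```python
import random
import statistics


def sample_means(parent, n, trials, rng):
    """Draw `trials` random samples of size n (with replacement) and return their means."""
    return [statistics.fmean(rng.choices(parent, k=n)) for _ in range(trials)]


rng = random.Random(42)  # fixed seed so the run is repeatable

# Assumed parent population: strongly right-skewed (exponential), not bell-shaped.
parent = [rng.expovariate(1.0) for _ in range(20_000)]
parent_mean = statistics.fmean(parent)
parent_sd = statistics.pstdev(parent)

n = 50  # sample size
means = sample_means(parent, n, trials=4_000, rng=rng)

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)
predicted_sd = parent_sd / n ** 0.5  # parent SD divided by sqrt(n)

print(f"parent mean     = {parent_mean:.4f}")
print(f"mean of means   = {mean_of_means:.4f}")
print(f"SD of means     = {sd_of_means:.4f}")
print(f"parent SD/sqrt(n) = {predicted_sd:.4f}")
```

With a finite number of trials the two pairs of values agree only approximately; the agreement tightens as the number of trials grows.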
When a categorical variable is used to compare two populations, e.g., when you are interested in the difference between two population proportions, you can estimate that difference by taking a sample from each population and using the difference of the two sample proportions, p1 – p2, plus or minus a margin of error. The resulting range is called a confidence interval (CI) for the difference of the two population proportions. The confidence interval can be calculated using the following formula, which relies on the normal approximation provided by the Central Limit Theorem:

(p1 – p2) ± z* × √[ p1(1 – p1)/n1 + p2(1 – p2)/n2 ]

where p1 and n1 are the sample proportion and sample size of the first sample, and p2 and n2 are the sample proportion and sample size of the second sample. The value z* is the appropriate value from the standard normal distribution for your desired confidence level, which can be obtained from a z*-value table.

Example [3]

In a clinical trial, you want to estimate with 95% confidence the difference between the percentage of all females who experienced nausea and the percentage of all males who experienced nausea when taking the investigational drug. If the difference is significant, the drug might be more suitable for one gender than the other.

1. Because you want a 95% confidence interval, your z*-value (from the table) is 1.96.
2. Suppose your random sample of 100 females includes 53 females who experienced nausea when taking the investigational drug, and your sample of 110 males includes 37 males who experienced nausea, so p1 is 53 / 100 = 0.53 and p2 is 37 / 110 ≈ 0.34.
3. The difference between these sample proportions (females – males) is 0.53 – 0.34 = 0.19.
4. Take 0.53 ∗ (1 – 0.53) to obtain 0.2491. Then divide that by 100 to get 0.0025. Then take 0.34 ∗ (1 – 0.34) to obtain 0.2244. Then divide that by 110 to get 0.0020. Add these two results to get 0.0025 + 0.0020 = 0.0045. Then find the square root of 0.0045, which is 0.0671.
5. 1.96 ∗ 0.0671 gives you 0.13, or 13%, which is the margin of error.
6. Your 95% confidence interval for the difference between the percentage of females who experienced nausea and the percentage of males who experienced nausea is 0.19 or 19% (which you got in Step 3), plus or minus 13% (which you got in Step 5). The lower end of the interval is thus 0.19 – 0.13 = 0.06 or 6% and the upper end is 0.19 + 0.13 = 0.32 or 32%.

Because the entire interval lies above zero, you can now say with 95% confidence that, based on your sample, a higher percentage of females than males experienced nausea when taking the investigational drug, and that the difference in these percentages is somewhere between 6% and 32%.
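The worked example can also be reproduced in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `two_proportion_ci` is hypothetical (it does not come from the article or its sources), and z* is computed with the standard library's `statistics.NormalDist` instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Return (difference, margin of error, (lower, upper)) for p1 - p2.

    x1, x2 are the counts with the characteristic; n1, n2 are the sample sizes.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    # z* is the two-sided critical value; about 1.96 for 95% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error of the difference between the two sample proportions.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z_star * se
    diff = p1 - p2
    return diff, margin, (diff - margin, diff + margin)


# Data from the example: 53 of 100 females and 37 of 110 males reported nausea.
diff, margin, (low, high) = two_proportion_ci(53, 100, 37, 110)
print(f"difference = {diff:.3f}, margin of error = {margin:.3f}")
print(f"95% CI: {low:.3f} to {high:.3f}")
```

Because the code does not round intermediate values, its results differ slightly in the last digit from the hand calculation: the interval comes out at about 0.062 to 0.325 instead of 0.06 to 0.32.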

References

1. http://en.wikipedia.org/wiki/Central_limit_theorem

2. http://www.chem.uoa.gr/applets/appletcentrallimit/appl_centrallimit2.html

3. http://www.dummies.com/how-to/content/how-to-estimate-the-difference-betweentwo-proport.html

Author Melissa Pressman, PhD, is the Senior Manager of Clinical Trials at Insys Therapeutics in Phoenix, Arizona. She is also an Associate Professor of Research at the University of Arizona – College of Medicine (Phoenix Campus) and Grand Canyon University. Contact her at 1.480.280.1572 or melissa.pressman@mihs.org.
