MAT 2379: Introduction to Biostatistics
The learning objectives are broken down into broad objectives and specific objectives.
Broad Course Objectives:
- Learn the language of probability theory.
- Understand basic principles of statistical inference from a frequentist point of view.
- Build a starter statistical toolbox with appreciation for both the utility and limitations of these techniques.
- Use software to do statistics.
Specific Course Learning Objectives for each topic:
Introduction to Probability:
- Compute probabilities of random events encountered in biological applications, in particular in genetics.
- Identify situations leading to intersections or unions of events and compute the associated probabilities.
- Compute conditional probabilities for examples encountered in the biological sciences.
- Compute and interpret various rates associated with diagnostic tests in medical studies (e.g. sensitivity, specificity, positive/negative
predictive value).
- Compute probabilities associated with independent events.
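As a rough illustration of the diagnostic-test rates listed above, the following R sketch computes them from a 2x2 table of counts. The counts (`tp`, `fn`, `fp`, `tn`) are hypothetical, invented for this example, and do not come from the course material.

```r
# Hypothetical counts for a diagnostic test (invented for illustration)
tp <- 90    # diseased, test positive
fn <- 10    # diseased, test negative
fp <- 45    # healthy, test positive
tn <- 855   # healthy, test negative

sensitivity <- tp / (tp + fn)   # P(test positive | diseased)
specificity <- tn / (tn + fp)   # P(test negative | healthy)
ppv <- tp / (tp + fp)           # P(diseased | test positive)
npv <- tn / (tn + fn)           # P(healthy | test negative)

# Bayes' rule gives the same positive predictive value from the prevalence
prevalence <- (tp + fn) / (tp + fn + fp + tn)
ppv_bayes <- sensitivity * prevalence /
  (sensitivity * prevalence + (1 - specificity) * (1 - prevalence))

print(c(sensitivity = sensitivity, specificity = specificity,
        ppv = ppv, npv = npv))
```

Computing the positive predictive value both directly and through Bayes' rule is a simple way to check that the two approaches agree.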
Probability Models:
- Set up and work with a discrete random variable as a representation of a population. Compute the population mean, the population
standard deviation, and the probability of an event concerning a population.
- Identify a binomial experiment in the context of a real-world problem. Compute the mean, the standard deviation, and the
probability of an event concerning a binomial random variable with a hand-held calculator and with statistical software.
- Use a normal distribution to model a population. Find the probability of an event and a quantile of a normal distribution using a
normal probability table and using statistical software.
Descriptive Statistics:
- Calculate with a hand-held calculator and with statistical software various descriptive measures of central tendency: mean,
median, geometric mean.
- Calculate with a hand-held calculator and with statistical software various descriptive measures of variability: standard
deviation, range, IQR, geometric standard deviation.
- Identify outliers.
- Construct a histogram and use the plot to describe the shape of the distribution of a quantitative variable (using statistical software).
- Construct comparative boxplots (using statistical software), and use these plots to compare the distributions of groups of
numerical values.
- Assess the normality of a quantitative variable with a quantile-quantile plot or a normal probability plot (constructed using
statistical software).
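The descriptive measures named above can be sketched in R as follows. The data vector `x` is hypothetical, and the geometric mean and geometric standard deviation are computed here by exponentiating the mean and standard deviation of the logged values, which assumes all observations are positive.

```r
# Hypothetical positive measurements (invented for illustration)
x <- c(1, 2, 4, 8, 16)

# Measures of central tendency
x_mean <- mean(x)
x_median <- median(x)
geo_mean <- exp(mean(log(x)))   # geometric mean, requires x > 0

# Measures of variability
x_sd <- sd(x)
x_range <- diff(range(x))       # largest value minus smallest value
x_iqr <- IQR(x)                 # third quartile minus first quartile
geo_sd <- exp(sd(log(x)))       # geometric standard deviation

print(c(mean = x_mean, median = x_median, geo_mean = geo_mean))
print(c(sd = x_sd, range = x_range, IQR = x_iqr, geo_sd = geo_sd))
```

For the plots, base R provides `hist(x)`, `boxplot(y ~ group)` and `qqnorm(x)` followed by `qqline(x)`, where `y` and `group` stand for a numeric variable and a grouping factor of your own.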
Statistical Inference:
- Construct a confidence interval for the mean of a population and perform a test of hypothesis for the mean of a population, under
the assumption that the variance is known or unknown.
- Interpret the errors associated with a test of hypothesis, and report the conclusion of this test based on the associated p-value.
- Construct a confidence interval for the proportion of individuals with a certain biological characteristic in a population, and
perform a test of hypothesis for this proportion.
- Compare the means of two independent populations by means of confidence intervals and tests of hypothesis, by first assessing the
assumptions of normality and equality of variances (using statistical software).
- Compare the means of dependent (or paired) data using confidence intervals and tests of hypothesis.
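To make the first inference objective concrete, here is a minimal R sketch of a confidence interval and a test for one population mean when the variance is unknown. The sample `x` and the hypothesized mean `mu0` are assumptions made up for this example.

```r
# Hypothetical sample of measurements (invented for illustration)
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.7)
mu0 <- 5   # hypothesized population mean (an assumption of this example)

# One-sample t test (variance unknown) with a 95% confidence interval
res <- t.test(x, mu = mu0, conf.level = 0.95)

# The same interval from the formula: mean +/- t quantile * s / sqrt(n)
n <- length(x)
margin <- qt(0.975, df = n - 1) * sd(x) / sqrt(n)
ci_manual <- c(mean(x) - margin, mean(x) + margin)

print(res$conf.int)   # interval reported by t.test
print(ci_manual)      # interval computed by hand
print(res$p.value)    # p-value of the two-sided test of H0: mu = mu0
```

The same function covers the comparison objectives: `t.test(x, y, var.equal = TRUE)` for two independent samples with equal variances and `t.test(x, y, paired = TRUE)` for paired data, where `y` would be a second sample.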
Simple Linear Regression and Correlation:
- Use a scatter plot to describe the association between two quantitative variables.
- Compute and use a sample correlation to describe the strength and direction of the linear association between two quantitative
variables.
- Find the simple linear regression model and be able to interpret the slope and y-intercept.
- Predict the value of a new response using the simple linear regression model.
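The regression objectives above can be sketched in R along these lines. The vectors `x` and `y` are hypothetical data invented for the example, and the new value `x = 6` is likewise an arbitrary choice.

```r
# Hypothetical paired observations (invented for illustration)
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Sample correlation: strength and direction of the linear association
r <- cor(x, y)

# Least-squares fit of the simple linear regression model y = b0 + b1 * x
fit <- lm(y ~ x)
b <- coef(fit)   # b[1] is the y-intercept, b[2] is the slope

# Predicted response for a new value of the explanatory variable
y_new <- predict(fit, newdata = data.frame(x = 6))

print(r)
print(b)
print(y_new)
```

A scatter plot of the same data can be drawn with `plot(x, y)`, and `abline(fit)` adds the fitted line to it.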
During the semester, we will suggest exercises from an online bank of questions.
Here is a link to the online bank of questions.
- Sept. 4:
- For questions concerning the various approaches to probability, try exercises 24 to 27 in the Probability section.
- For questions concerning elementary genetics, try exercises 33 and 34 in the Probability section.
- Sept. 9: For questions concerning the axioms and rules of probability, try exercises 2, 6, 7, 8, 14, 15, 18 in the Probability section.
- Sept. 11: For questions concerning conditional probability, try exercises 9, 10 in the Probability section.
- Sept. 16:
- For questions concerning the multiplication rule, the total probability rule and Bayes' rule, try exercises 1, 5, 13, 19, 20, 22, 23 in the Probability section.
- For questions concerning diagnostic tests of Chapter 4, try exercises 3, 4, 16, 17 in the Probability section.
- Sept. 18:
- For questions concerning independence of events, try exercises 11, 12, 21 in the Probability section.
- Sept. 23:
- For questions concerning a discrete random variable, try exercises 5, 15, 18, 22, 23, 24, 25 in the Random Variables section.
- Sept. 25:
- For questions concerning a binomial random variable, try exercises 1, 2, 6, 16, 19 in the Random Variables section.
- For questions concerning a normal random variable, try exercises 3, 9, 10, 11, 12, 17, 20, 21 in the Random Variables section.
- Sept. 30: Suggested Exercises:
- Let X be a binomial random variable with n=1000 and
p=0.25. Compute the following probabilities with R.
- P(X=250).
- P(200 ≤ X ≤ 250).
- P(X>200).
Answers:
- P(X=250) = f(250)=0.02912411;
- P(200 ≤ X ≤ 250)=F(250)-F(199)=0.5169063;
- P(X>200)=1-P(X ≤ 200)=1-F(200)=0.999891.
With R:
> dbinom(250,1000,.25)
[1] 0.02912411
> pbinom(250,1000,.25)-pbinom(199,1000,.25)
[1] 0.5169063
> 1-pbinom(200,1000,.25)
[1] 0.999891
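As a cross-check on the binomial answers above, the sketch below recomputes the interval probability by summing the probability mass function, and the upper tail with the `lower.tail = FALSE` argument of `pbinom`. The variable names are chosen for this example only.

```r
n <- 1000
p <- 0.25

# P(200 <= X <= 250) as a sum of the probability mass function over 200..250
interval_by_sum <- sum(dbinom(200:250, n, p))

# The same probability as a difference of cumulative probabilities
interval_by_cdf <- pbinom(250, n, p) - pbinom(199, n, p)

# P(X > 200) requested directly as an upper-tail probability
upper_tail <- pbinom(200, n, p, lower.tail = FALSE)

print(c(interval_by_sum, interval_by_cdf, upper_tail))
```

Note that the lower bound uses F(199), not F(200), because X is discrete and the event includes X = 200.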
- Suggested Exercises: Let X be a normal random variable with mean 125 and standard deviation 45.2.
Compute the following probabilities with R. (Hint: Use the pnorm function.)
- P(130 < X < 137).
- P(X > 124).
Answers:
- P(130 < X < 137) = F(137) - F(130) = 0.06064179;
- P(X > 124) = 1 - F(124) = 0.5088254.
With R:
> pnorm(137,125,45.2)-pnorm(130,125,45.2)
[1] 0.06064179
> 1-pnorm(124,125,45.2)
[1] 0.5088254
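The same normal probabilities can also be obtained by standardizing first, which mirrors how a normal probability table is used. The sketch below compares the two routes; the variable names are chosen for this example only.

```r
mu <- 125
sigma <- 45.2

# Standardize the bounds: z = (x - mu) / sigma
z_low <- (130 - mu) / sigma
z_high <- (137 - mu) / sigma

# Table-style computation with the standard normal distribution
p_table_style <- pnorm(z_high) - pnorm(z_low)

# Direct computation with the mean and standard deviation passed to pnorm
p_direct <- pnorm(137, mu, sigma) - pnorm(130, mu, sigma)

# P(X > 124) requested directly as an upper-tail probability
p_upper <- pnorm(124, mu, sigma, lower.tail = FALSE)

print(c(p_table_style, p_direct, p_upper))
```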
Determine (with the qnorm function) a value x such that:
- P(X > x) = 25%.
- P(X < x) = 25%.
Answer:
- Since P(X > x) = 25% is equivalent to F(x) = 0.75, then x = 155.4869.
- Since P(X < x) = 25% is equivalent to F(x) = 0.25, then x = 94.51306.
With R:
> qnorm(0.75,125,45.2)
[1] 155.4869
> qnorm(0.25,125,45.2)
[1] 94.51306
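One way to check a quantile is to feed it back into `pnorm` and confirm that the requested probability comes out. The sketch below does this, and also asks `qnorm` for the upper-tail quantile directly with `lower.tail = FALSE`; the variable names are chosen for this example only.

```r
mu <- 125
sigma <- 45.2

# Value with 25% of the distribution above it, requested as an upper tail
x_upper <- qnorm(0.25, mu, sigma, lower.tail = FALSE)

# Value with 25% of the distribution below it
x_lower <- qnorm(0.25, mu, sigma)

# Round trip: both tail probabilities should come back as 0.25
check_upper <- pnorm(x_upper, mu, sigma, lower.tail = FALSE)
check_lower <- pnorm(x_lower, mu, sigma)

print(c(x_upper, x_lower, check_upper, check_lower))
```

Because the normal distribution is symmetric about its mean, the two quantiles lie at equal distances on either side of 125.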
- Oct. 7:
- For questions concerning the computation of descriptive statistics, try exercises 1 to 4 and also 7 to 12 in the Descriptive Statistics section.
- For questions concerning comparative boxplots, try exercises 5 and 6 in the Descriptive Statistics section.
- Oct. 23:
- For questions concerning a log transformation or a linear transformation, try exercises 13 to 17 in the Descriptive Statistics section.
- Oct. 28:
- For questions concerning the sampling distribution of the sample mean, try exercises 4, 5, 7 in the Sampling Distribution section.
- For questions concerning the linear combination of random variables, try exercises 13, 14 in the Random Variables section.
- Oct. 30:
- For questions concerning quantile-quantile plots (qq-plots), try exercises 22 to 25 in the Descriptive Statistics section.
- For questions concerning the point estimation of the mean, try exercise 2 in the Sampling Distribution section.
- Nov. 4:
- For questions concerning the point estimation of a proportion, try exercise 1 in the Sampling Distribution section.
- For questions concerning the T distribution, try exercises 3 and 6 in the Sampling Distribution section.
- For questions concerning the interval estimation of a mean or of a proportion, try exercises 1 to 11 in the Confidence Intervals section.
- Nov. 7:
- For questions concerning the formulation of hypotheses, try exercises 2, 11, 18 in the Hypothesis Testing section.
- For questions concerning the types of errors in hypothesis testing, try exercises 4, 5 in the Hypothesis Testing section.
- Nov. 11:
- For questions concerning hypothesis testing, try exercises 1 to 19 in the Hypothesis Testing section.
- Nov. 20:
- For questions concerning the verification of normality and equality of variances for independent populations, try exercises 18 and 19 in the Descriptive Statistics section.
- For questions concerning comparing the means of two independent populations, try exercises 1, 4, 6, 7, 10, 11, 12, 13, 14, 16 in the Comparing Means section.
- Nov. 28:
- For questions concerning paired measurements, try exercises 2, 3, 5, 8, 9 in the Comparing Means section.
- For questions concerning correlation and regression, try exercises 1, 2, 5, 6, 7, 8, 9, 10, 11, 12 in the Regression section.
Here is the schedule for the assignments of MAT 2379 (fall 2024).
- Assignment 1
Deadline: Before 11:59 pm on Friday, Sept. 27.
- Assignment 2
Deadline: Before 11:59 pm on Friday, Oct. 11.
- Assignment 3
Deadline: Before 11:59 pm on Friday, Nov. 8.
- Assignment 4
Deadline: Before 11:59 pm on Wednesday, Nov. 27.
Here are resources for R.
Here is a web page with examples concerning R and links to YouTube videos.