One of the fundamental tasks in science and other disciplines is
using a sample of data to understand aspects of a larger population. The
conclusions made about that population use a method referred to as
**statistical inference**. Although seemingly backwards,
the logic behind statistical inference is to reject a research claim
which is not of interest (thereby validating the claim of interest). For
example, in order to show that one diet is better than the other, you
first reject the idea that the two diets are equally effective at
causing weight loss.

Statistical inference plays a role in many different data analyses (including the linear models youâ€™ve seen in previous tutorials). In this tutorial, we introduce the idea of a p-value which can be thought of as the degree of disagreement between the data and the research claim. We also introduce confidence intervals which provide a range of plausible values for the measure of interest (e.g., the number of additional pounds lost, on average, for the first diet as compared to the second diet).

- Translate research hypotheses into statistical hypotheses
- Perform a randomization test for two proportions using the
**infer**package - Interpret the results from the randomization test in the context of the research
- Describe different (and their impact on the research conclusions) errors that might have been present
- Create a bootstrap confidence interval for a single proportion using
the
**infer**package.

The fundamental principle that there is variability across different samples will be investigated in this lesson. You will be able to characterize the variability across samples as compared to the underlying population. Remember, the goal of the research is to understand the population, while the information you have comes only from a single sample of data.

Using the **infer** package, you will be able to work
through an entire randomization test. For a given dataset and research
question, you will identify hypotheses and decide whether or not it is
appropriate to reject the null hypothesis in favor of the research claim
of interest.

Working through another complete hypothesis test (using the
**infer** package), you will focus on how and when false
conclusions are sometimes drawn. You will learn to distinguish between a
Type I error where you are confident in a claim that isnâ€™t valid and a
Type II error where you fail to conclude anything when your science is
correct. Additionally, you will learn the important role that number of
observations play in reducing error rates.

Additionally, using the **infer** package, you will
learn to create confidence intervals for a single population proportion.
A confidence interval will provide a range of plausible values to
estimate the unknown population value. The parameter is the true unknown
proportion of successes in the population, estimated using the sample
proportion. Bootstrapping allows for estimation of the variability of
the sample proportion which leads to an interval estimate.